FORWARD ERROR CORRECTING SYSTEM WITH ENCODERS CONFIGURED IN PARALLEL AND/OR SERIES

This non-provisional patent application claims the benefit under 35 U.S.C. Section 119(e) of United States Provisional Patent Application No. 60/094,629, filed on July 30, 1998, Provisional Patent Application No. 60/098,394, filed on August 30, 1998, and Provisional Patent Application No. 60/133,390, filed on May 10, 1999, all of which are incorporated herein by reference.

Field of the Invention 


The present invention relates to the use of forward error correction techniques in data transmission over wired and wireless systems, using an optional Reed-Solomon encoder as an outer encoder and a multiple concatenated convolutional encoder (in serial or parallel configuration) as an inner encoder. A preferred embodiment of the invention pertains particularly to ADSL systems, as a representative species of wire-based systems.

Background of the Invention 

The invention is based on the use of a multiple concatenated convolutional encoder in serial or in parallel configuration. In the serial case it is called a Serial Multiple Concatenated Convolutional Code (SMCCC), and in the parallel case a Parallel Multiple Concatenated Convolutional Code (PMCCC). The concatenation adds redundancy to the signal in a way that improves the performance of the coding (increasing the coding gain).
Theory of Trellis Coding 

Modulation constellations of more than 2 points (such as Quadrature Amplitude Modulation (QAM) and Quaternary Phase Shift Keying (QPSK)) are used to increase the bit rate at the cost of smaller Euclidean distances (the distance between adjacent points in a signal constellation). Coding techniques are used to decrease transmission errors when transmitting over power-limited channels.

Trellis Coding combines coding and modulation to improve bit error rate performance. As in other forms of forward error correction, the basic idea behind Trellis Coding is to introduce controlled redundancy in order to reduce channel error rates. What sets Trellis Codes apart is that this technique introduces redundancy by doubling the number of signal points in the QAM constellation (partitioning). BPSK and QPSK signal constellations are shown in Figures 1 and 2.

The actual (noisy) received signal will tend to be somewhere around the "correct" signal point. The receiver chooses the signal point closest to the noisy received signal. As more points are added to the signal constellation while the power is kept constant, the probability of error increases, because the Euclidean distance "d" (the distance between adjacent signal points) decreases and the receiver has a more difficult job making the correct decision. It therefore makes sense that the Euclidean distance "d" dominates the probability of error expressions.

Since the power is the same for both constellations, the required energy is also the same. The signal points are d = 2 apart for BPSK and d = 1.414 apart for QPSK.
The following expression is for the probability of error (for BPSK):

$$P_e = \frac{1}{2}\,\mathrm{erfc}\left(\frac{d}{2\sqrt{N_0}}\right)$$

Substituting for "d" in the above:

$$P_e = \frac{1}{2}\,\mathrm{erfc}\left(\frac{1}{\sqrt{N_0}}\right) \qquad \text{(BPSK)}$$

where erfc is the complementary error function.

Trellis Coding expands on this concept to increase the Euclidean path distance. Apart from the constant factor in front of the complementary error function, both error expressions depend on the signal spacing d, and the probability of error for QPSK is higher (not surprising, since the signal spacing is smaller). Trellis Coding enables us to recover from this increase in the probability of error.
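As a numerical illustration of how the signal spacing drives the error probability, the following minimal Python sketch evaluates an expression of the form c * erfc(d / (2 * sqrt(N0))). The function name, the noise value N0 = 0.5, and the constants c = 1/2 for BPSK and c = 1 for QPSK are assumptions made for illustration only.

```python
import math

def symbol_error_probability(d, n0, factor):
    """Evaluate factor * erfc(d / (2*sqrt(N0))) for signal spacing d."""
    return factor * math.erfc(d / (2.0 * math.sqrt(n0)))

N0 = 0.5  # hypothetical noise level, chosen only for the demonstration
p_bpsk = symbol_error_probability(2.0, N0, 0.5)             # d = 2
p_qpsk = symbol_error_probability(math.sqrt(2.0), N0, 1.0)  # d = 1.414
print(f"BPSK: {p_bpsk:.3e}, QPSK: {p_qpsk:.3e}")
```

With equal power, the smaller spacing of QPSK gives the larger error probability, which is the loss that Trellis Coding is meant to recover.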

M-QAM and PSK normally use a signal set of $M = 2^k$ symbols in order to reduce the symbol rate by a factor of M. Examples of M = 4 QAM and QPSK are shown in Figures 3 and 4 respectively.

Doubling the number of signal points in order to support two-state Trellis Coding, we get the signal constellations for two-state Trellis Coding shown in Figures 5 and 6.
Thus, Trellis Coding uses $2^{k+1}$ possible symbols for the same factor-of-M reduction of bandwidth (and each signal is still transmitted during the same signaling period).

Trellis Coding provides controlled redundancy by doubling the number of signal points. In addition, Trellis Coding defines the way in which signal transitions are allowed to occur. (Signal transitions that do not follow this scheme will be detected as errors.)

This is best explained using the Trellis Coded 8-PSK example. The 8-PSK signal constellation is shown in Figure 7, where we can see the individual signal points.

Note that the signal point labels "0,1,2,3,4,5,6,7" do not correspond to the actual data being sent. They are only 
convenient ways to label the signal points and keep from cluttering up the graphics. 

Without coding, the performance of 8-PSK depends on $d_0$ ($d_0 = 2\sin(\pi/8) = 0.765$), which corresponds to a higher bit error rate than QPSK ($d_1 = 1.414$). By using Trellis Coding, it is possible to improve the performance by restricting the way in which signals are allowed to transition.
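The distances quoted above can be checked with a short Python sketch. It assumes that the eight points lie on a circle of unit radius and are labelled 0 to 7 in angular order; both the labelling convention and the function name psk8_distance are assumptions, since Figure 7 is not reproduced here.

```python
import math

def psk8_distance(a, b):
    """Euclidean distance between 8-PSK points a and b on the unit circle."""
    steps = abs(a - b) % 8
    steps = min(steps, 8 - steps)          # shortest way round the circle
    return 2.0 * math.sin(steps * math.pi / 8.0)

# Distances for points that are 1, 2, 3 and 4 positions apart.
for steps in range(1, 5):
    print(steps, round(psk8_distance(0, steps), 3))
```

Adjacent points are 0.765 apart, and points two positions apart (one QPSK subset) are 1.414 apart.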

First the states of the trellis are defined. Let us label one state as "0426" and the other state as "1537". Each digit refers to one of four permitted signal points in the state (state points), with each state by itself representing a QPSK constellation and each state's constellation being offset by 45 degrees from the other.

Figure 8 describes a two-state trellis, 8-PSK system. If the system is in state "0426", only one of these four state points is used. If a "0" or "4" is transmitted, the system remains in the same state. If, however, a "2" or "6" is transmitted, the system switches to the "1537" state. Now only one of these four state points is used. If a "3" or "7" is transmitted, the system remains in this state; otherwise, if a "1" or a "5" is transmitted, it switches back to the "0426" state. Again, note that each symbol represents two bits, so that when switching states, the QPSK constellation is shifted by 45 degrees.
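The transition rules just described can be sketched as a small state machine. The table and function names below are hypothetical; the rules themselves are taken from the description of Figure 8 above.

```python
# Transition rules as described for the two-state trellis of Figure 8.
STAY = {"0426": {0, 4}, "1537": {3, 7}}
SWITCH = {"0426": {2, 6}, "1537": {1, 5}}
OTHER = {"0426": "1537", "1537": "0426"}

def follow(symbols, state="0426"):
    """Return the final state, or raise ValueError on a forbidden symbol."""
    for s in symbols:
        if s in STAY[state]:
            continue                      # symbol keeps the current state
        if s in SWITCH[state]:
            state = OTHER[state]          # symbol moves to the other state
        else:
            raise ValueError(f"symbol {s} not allowed in state {state}")
    return state
```

A sequence that raises ValueError corresponds to a signal transition that does not follow the scheme and would therefore be detected as an error.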

Assuming that all input signals are equally likely, all signal paths are traced out over time. Just as for non-Trellis coding, the received signal includes noise and will tend to be located somewhere around the state points. The receiver again has to make a decision based on which signal point is closest, and a mistaken output state value will be chosen if the receiver makes an incorrect decision.

In order to illustrate error events, let us assume that the transmitter is sending continuous "7" symbols. Figure 9 illustrates the possible error events. In this case "5" followed by "6" is received instead of the transmitted "7" - "7" sequence. The Euclidean distance for this path is the square root of the sum of the squares of the distances of each interval (see Figure 7 for an illustration of the Euclidean distances and Figure 8 for the Trellis Diagram):

$$\sqrt{d^2(7,5) + d^2(7,6)} = \sqrt{d_1^2 + d_0^2} = \sqrt{2 + \left[2\sin(\pi/8)\right]^2} = 1.608$$

where d(7,5) and d(7,6) are the Euclidean distances between the signals "7" and "5", and "7" and "6", respectively.
Figure 10 shows the case when "1" followed by "6" is received instead of the transmitted "7" - "7" sequence. The Euclidean distance (see Figure 7 for an illustration of the Euclidean distances) for this path is:

$$\sqrt{d^2(7,1) + d^2(7,6)} = \sqrt{d_1^2 + d_0^2} = \sqrt{2 + \left[2\sin(\pi/8)\right]^2} = 1.608$$



Figure 11 shows the case when "5" followed by "2" is received instead of the transmitted "7" - "7" sequence. The Euclidean distance (see Figure 7 for an illustration of the Euclidean distances and Figure 8 for the Trellis Diagram) for this path is:

$$\sqrt{d^2(7,5) + d^2(7,2)} = \sqrt{d_1^2 + d_2^2} = \sqrt{2 + \left[2\sin(3\pi/8)\right]^2} = 2.33$$



Figure 12 shows the case when "1" followed by "2" is received instead of the transmitted "7" - "7" sequence. The Euclidean distance (see Figure 7 for an illustration of the Euclidean distances and Figure 8 for the Trellis Diagram) for this path is:

$$\sqrt{d^2(7,1) + d^2(7,2)} = \sqrt{d_1^2 + d_2^2} = \sqrt{2 + \left[2\sin(3\pi/8)\right]^2} = 2.33$$

The only remaining error event is the single-interval "3" instead of "7" error event, which has a Euclidean distance of 2 (see Figure 7).

Because of their large Euclidean distance (2.33), the "5" - "2" and "1" - "2" error events are least likely. The "1" - "6" and "5" - "6" error events are most likely because of their low Euclidean distance (1.608).

The minimum Euclidean distance for a trellis is the minimum free Euclidean distance $d_E$ (similar to the minimum free distance in convolutional coding). For the above example, $d_E = 1.608$.
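The five error events discussed above can be tabulated with a short Python sketch. It assumes unit-radius 8-PSK points labelled 0 to 7 in angular order and a transmitted all-"7" sequence; the helper names are hypothetical.

```python
import math

def psk8_distance(a, b):
    """Euclidean distance between 8-PSK points a and b on the unit circle."""
    steps = abs(a - b) % 8
    steps = min(steps, 8 - steps)
    return 2.0 * math.sin(steps * math.pi / 8.0)

def path_distance(sent, received):
    """Square root of the summed squared per-interval distances."""
    return math.sqrt(sum(psk8_distance(s, r) ** 2 for s, r in zip(sent, received)))

# Received sequences for each error event while "7" is sent in every interval.
events = {"5-6": [5, 6], "1-6": [1, 6], "5-2": [5, 2], "1-2": [1, 2], "3": [3]}
distances = {name: path_distance([7] * len(rx), rx) for name, rx in events.items()}
d_free = min(distances.values())   # minimum free Euclidean distance
```

The smallest entry of distances is the minimum free Euclidean distance of this two-state example.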

Since $d_E$ is a measure of the closest spacing between adjacent state points (and therefore also more likely to cause errors), it dictates the lower bound for the probability of error for the entire Trellis in the following way:

$$P_e \ge \frac{a(d_E)}{2}\,\mathrm{erfc}\left(\frac{d_E}{2\sqrt{N_0}}\right)$$

where $a(d_E)$ is the number of error paths at distance $d_E$. In the two-state trellis example, there are 2 error paths at a distance of $d_E$. Therefore the probability of error is:

$$P_e \ge \mathrm{erfc}\left(\frac{1.608}{2\sqrt{N_0}}\right)$$

We found the probability of error for regular (non-Trellis-coded) QPSK to be:

$$P_e = \mathrm{erfc}\left(\frac{1.414}{2\sqrt{N_0}}\right)$$

The improvement of two-state Trellis Coding over QPSK is therefore 1.608/1.414, or 1.1 dB.

This is a low coding gain for the amount of overhead required to handle Trellis Coding. One might ask whether there is a way to increase the coding gain obtainable with Trellis Coding. There certainly is. First, it is possible to increase the number of trellis states above 2, such as the four-state trellis shown in Figure 13. Note that the permitted state transitions are only drawn for column "I". The same state transitions are again permitted in columns "II", "III", and so on.

By increasing the number of states, one also increases the Euclidean distances. For example, in the above four-state Trellis Coding case, all error events have a Euclidean distance of more than 1 (single "4" error). This single "4" error is the only "lowest" error event with minimum Euclidean path distance $d_E$.



Therefore, the lower bound for four-state Trellis Coding is:

$$P_e \ge \frac{1}{2}\,\mathrm{erfc}\left(\frac{2}{2\sqrt{N_0}}\right)$$

Comparing this equation to the equation for regular QPSK, we have a coding gain of 2:1 or 3 dB.
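The decibel figures quoted for the two-state and four-state cases follow from the ratio of minimum Euclidean distances. The sketch below assumes a four-state minimum distance of 2, inferred from the stated 2:1 ratio of squared distances against QPSK; the function name is hypothetical.

```python
import math

def coding_gain_db(d_coded, d_uncoded):
    """Gain in dB from the ratio of minimum Euclidean distances."""
    return 20.0 * math.log10(d_coded / d_uncoded)

two_state = coding_gain_db(1.608, 1.414)   # two-state trellis vs. QPSK
four_state = coding_gain_db(2.0, 1.414)    # assumed d = 2 for the four-state trellis
print(f"{two_state:.1f} dB, {four_state:.1f} dB")
```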
Table 1 illustrates further coding gains that can be obtained by using even more states in the trellis:

Table 1. Trellis Coding Gain vs. Number of Trellis States

# of Trellis States    Coding Gain
4                      3.54
8                      4.01
16                     4.44
32                     5.13
64                     5.33
128                    5.33
256                    5.51



Another way to improve coding gain in Trellis Coding is to go to more than 2 dimensions. 

Summary of the Invention 

The present invention comprises forward error correction techniques in data transmission over wired systems using an optional Reed-Solomon encoder as an outer encoder and a multiple concatenated convolutional encoder (MCCC) (in serial or parallel configuration) as an inner encoder. By an "optional" Reed-Solomon outer encoder we mean that it may or may not be present. We describe its application to ADSL DMT (Discrete Multi-Tone, multiple-carrier) based systems. The extension to CAP/QAM (single-carrier) based systems, other xDSL systems (HDSL, VDSL, HDSL2, etc.), other wired communication systems, wireless systems and satellite systems is straightforward.
ADSL modems are designed to operate between a Central Office CO (or a similar point of presence) and a customer premises CPE. As such, they use existing telephone network wiring between the CO and the CPE. There are several modems in this class, which function in a generally similar manner. All of these modems usually transmit their signals above the voice band. As such, they are dependent on adequate frequency response above the voice band.

With the technique that we propose in this invention, it is possible to reach longer loops or reduce the transmitter power for ADSL systems.

For wireless systems it is possible to reduce the power consumption, increase the coverage area and to extend the life 
of the portable systems. 

For satellite systems it is possible to increase the G/T factor by around 4 dB, to increase the life of the satellite, to increase the coverage area and to reduce the requirements of the terrestrial systems.
With the use of Trellis Coded Modulation (TCM) it is possible to obtain coding gains between 3 and 6 dB (depending on the dimension of the trellis). Using the technique that we propose, the performance is within 1 dB of the Shannon limit at a bit error probability of $10^{-7}$.

MCCC achieves near-Shannon-limit error correction performance. We have done some simulations that show bit error probabilities as low as $10^{-5}$ at $E_b/N_0 = 0.6$ dB. PMCCC yields very large coding gains (around 10 or 11 dB). In the PMCCC case, below this value of $10^{-5}$, the role of the interleaver is very critical, and to avoid the error floor it is necessary to design the interleaver well. In our design the Reed-Solomon outer encoder helps to take care of this error floor below $10^{-7}$.
In this invention, we present simulation results, and we compare the Reed-Solomon encoder (for R=8 and R=16) with the Trellis plus Reed-Solomon (T+R=8 and T+R=16) and the two PCCC plus Reed-Solomon (TC+R=8 and TC+R=16). In all these cases we do not take into account the payload of the Reed-Solomon code, because it has the same effect in all coding techniques.

These results are for Gaussian noise. In many wire-line systems, broadband impulse noise is also a significant transmission impairment. Although we have not modeled impulse noise effects in this analysis, in DMT systems impulse noise whose duration is short compared to the frame size appears to be rather Gaussian-like, since it passes through a DFT in the receiver. Furthermore, because the noise is broadband, the noise energy in the signal band is distributed among the various frequency bins. Thus the additional immunity against additive white Gaussian noise provided by the trellis code should be beneficial for impulse noise as well.

We present an encoder, decoders and some simulation results. 

Brief Description of the Several Views of the Drawings 
Figure 1 shows a BPSK signal constellation;
Figure 2 shows a QPSK signal constellation;
Figure 3 shows a QAM signal constellation with M=4;
Figure 4 shows a QPSK signal constellation;
Figure 5 shows a QAM signal constellation with M=8 (used with two-state Trellis);
Figure 6 shows a PSK signal constellation with M=8 (used with two-state Trellis);
Figure 7 shows a signal constellation with M=8 (used with two-state Trellis);
Figure 8 shows a two-state Trellis 8-PSK system;
Figure 9 shows an error event "5" → "6" in a two-state Trellis encoding;
Figure 10 shows an error event "1" → "6" in a two-state Trellis encoding;
Figure 11 shows an error event "5" → "2" in a two-state Trellis encoding;
Figure 12 shows an error event "1" → "2" in a two-state Trellis encoding;
Figure 13 shows a four-state Trellis, 8-PSK system;
Figure 14 shows a serially concatenated (n,k,N) block code;
Figure 15 shows the action of a uniform interleaver of length 4 on sequences of weight 2;
Figure 16 shows a serially concatenated (n,k,N) convolutional code;
Figure 17 shows a code sequence in $A_{l,h,j}$;
Figure 18 shows analytical bounds for SMCBC1 for N = 4, 40, 400 and 4000;
Figure 19 shows analytical bounds for SMCBC2 for N = 5, 50, 500 and 5000;
Figure 20 shows analytical bounds for SMCBC3 for N = 7, 70, 700 and 7000;
Figure 21 shows analytical bounds for SMCCC1 for N = 200, 400, 600, 800, 1000 and 2000;
Figure 22 shows analytical bounds for SMCCC2 for N = 200, 400, 600, 800, 1000 and 2000;
Figure 23 shows analytical bounds for SMCCC3 for N = 200, 400, 600, 800, 1000 and 2000;
Figure 24 shows analytical bounds for SMCCC4;
Figure 25 shows a PMCCC;
Figure 26 shows a transmission system structure;
Figure 27 shows notations in a transmission system structure;
Figure 28 shows a PMCCC of three convolutional codes;
Figure 29 shows a signal flow graph for extrinsic information;
Figure 30 shows an iterative decoder structure for three parallel concatenated codes;
Figure 31 shows an iterative decoder structure for two parallel concatenated codes;

Figure 32 shows the convergence of turbo coding: bit-error probability versus number of iterations for various $E_b/N_0$ using the SW2-BCJR algorithm;
Figure 33 shows the convergence of turbo coding: bit-error probability versus number of iterations for various $E_b/N_0$ using the SWAL2-BCJR algorithm;
Figure 34 shows the bit-error probability as a function of the bit signal-to-noise ratio using the SW2-BCJR and SWAL2-BCJR algorithms with five iterations;
Figure 35 shows the number of iterations to achieve several bit-error probabilities as a function of the bit signal-to-noise ratio using the SWAL2-BCJR algorithm;
Figure 36 shows the number of iterations to achieve several bit-error probabilities as a function of the bit signal-to-noise ratio using the SW2-BCJR algorithm;
Figure 37 shows a basic structure for backward computation in the log-BCJR MAP algorithm;
Figure 38 shows a Trellis Termination;
Figure 39 shows an example where a block interleaver fails to "break" the input sequence;
Figure 40 shows the two PMCCC performance, r = '/*; 
Figure 41 shows performance with short block sizes;
Figure 42 shows three-code performance; 

Figure 43 shows a comparison of SMCBC and PMCBC with various interleaver lengths chosen so as to yield the 
same input decoding delay; 

Figure 44 shows a comparison of SMCCC and PMCCC with four-state MCCs; 

Figure 45 shows a block diagram of a parallel concatenated convolutional code (PMCCC): (a) a PMCCC with rate 1/3, (b) iterative decoding of a PMCCC;
Figure 46 shows a block diagram of a serial concatenated convolutional code (SMCCC): (a) an SMCCC with rate 1/3, (b) iterative decoding of an SMCCC;
Figure 47 shows a trellis encoder;
Figure 48 shows an edge of the trellis section;
Figure 49 shows the soft-input soft-output (SISO) model;
Figure 50 shows the convergence of PMCCC decoding: bit error probability versus the number of iterations using the ASW-SISO algorithm;
Figure 51 shows the convergence of iterative decoding for a serial concatenated code: bit error probability versus number of iterations using the ASW-SISO algorithm;
Figure 52 shows a comparison of two rate 1/3 PMCCC and SMCCC. The curves refer to six and nine iterations of the decoding algorithm and to an equal input decoding delay of 16,384;
Figure 53 shows a block diagram for a modem transmitter in accordance with this invention, for the Central Office and for STM transport;
Figure 54 shows a block diagram for a modem transmitter in accordance with this invention, for the Central Office and for ATM transport;
Figure 55 shows a block diagram for a modem transmitter in accordance with this invention, for the Remote modem and for STM transport;
Figure 56 shows a block diagram for a modem transmitter in accordance with this invention, for the Remote modem and for ATM transport;
Figure 57 shows ATU-C functional interfaces for STM transport at the V-C reference point;
Figure 58 shows ATU-C functional interfaces to the ATM layer at the V-C reference point;
Figure 59 shows an ATM cell delineation state machine;
Figure 60 shows an example implementation of the A 2 f measurement;
Figure 61 shows an ADSL superframe structure - ATU-C transmitter;
Figure 62 shows a fast synchronization byte ("fast byte") format - ATU-C transmitter;
Figure 63 shows an interleaved synchronization byte ("sync byte") format - ATU-C transmitter;
Figure 64 shows a fast data buffer - ATU-C transmitter;
Figure 65 shows an interleaved data buffer - ATU-C transmitter;
Figure 66 shows a scrambler;
Figure 67 shows a tone ordering and bit extraction example (without trellis coding);
Figure 68 shows a tone ordering and bit extraction example (with trellis coding);
Figure 69 shows a conversion of u to v and w;
Figure 70 shows a finite state machine for Wei's encoder;
Figure 71 shows a convolutional encoder;
Figure 72 shows a trellis diagram;
Figure 73 shows constellation labels for b = 2 and b = 4;
Figure 74 shows an expansion of point n into the next larger square constellation;
Figure 75 shows constellation labels for b = 3;
Figure 76 shows constellation labels for b = 5;
Figure 77 shows an MTPR test;
Figure 78 shows ATU-R functional interfaces for STM transport at the T-R reference point;
Figure 79 shows ATU-R functional interfaces to the ATM layer at the T-R reference point;
Figure 80 shows a fast data buffer - ATU-R transmitter;
Figure 81 shows an interleaved data buffer - ATU-R transmitter;
Figure 82 shows a two parallel concatenated convolutional encoder;
Figure 83 shows a conversion of u to v and w in the PMCCC encoder;
Figure 84 shows a decoder for PMCCC;
Figure 85 shows the convergence of the "constellation" interleaver for PMCCC;
Figure 86 shows an interleaver for PMCCC;
Figure 87 shows a serial convolutional concatenated encoder;
Figure 88 shows a decoder for SMCCC;
Figure 89 shows an interleaver for SMCCC;
Figure 90 shows the convolutional concatenated encoder used for simulations;
Figure 91 shows simulations for PMCCC; and,
Figure 92 shows the convolutional encoder used for simulations.
1. Performance Analysis, Design and Iterative Decoding of SMCCC and PMCCC
1.1. Analytical Bounds to the Performance of Serially Multiple Concatenated Codes
1.1.1. Serially Multiple Concatenated Block Codes (SMCBC)

The scheme of two serially concatenated block codes is shown in Figure 14. It is composed of two cascaded constituent codes (CCs), the outer (N, k) code $C_o$ with rate $R_c^o = k/N$ and the inner (n, N) code $C_i$ with rate $R_c^i = N/n$, linked by an interleaver of length N. The overall SMCBC is then an (n, k) code, and we will refer to it as the (n, k, N) code $C_S$, including also the interleaver length. In the following, we will derive an upper bound to the maximum-likelihood (ML) performance of the overall code $C_S$. We assume that the CCs are linear, so that the SMCBC is also linear and the uniform error property applies, i.e., the bit-error probability can be evaluated assuming that the all-zero codeword has been transmitted.

A crucial step in the analysis consists of replacing the actual interleaver, which performs a permutation of the N input bits, with an abstract interleaver called a "uniform interleaver". This abstract interleaver is defined as a probabilistic device that maps a given input word of weight $l$ into all distinct $\binom{N}{l}$ permutations of it, with equal probability $p = 1/\binom{N}{l}$ (see Figure 15). The output word of the outer code and the input word of the inner code share the same weight. Use of the uniform interleaver permits the computation of the "average" performance of SMCBCs, intended as the expectation of the performance of SMCBCs using the same MCCs, taken over the ensemble of all interleavers of a given length. It can be proved that the average performance is meaningful, in the sense that there will always be, for each value of the signal-to-noise ratio, at least one particular interleaver yielding performance better than or equal to that of the uniform interleaver.
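The action of the uniform interleaver can be illustrated with a minimal Python sketch for the case of length 4 and weight 2 mentioned for Figure 15. The function name is hypothetical, and the sketch only enumerates the possible outputs and their common probability.

```python
from itertools import combinations
from math import comb

def uniform_interleaver_outputs(n, weight):
    """All distinct length-n binary words of the given weight."""
    words = []
    for ones in combinations(range(n), weight):
        words.append(tuple(1 if i in ones else 0 for i in range(n)))
    return words

words = uniform_interleaver_outputs(4, 2)
probability = 1 / comb(4, 2)   # each output word is equally likely
```

Any weight-2 input word of length 4 is mapped to one of the six words in words, each with probability 1/6.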

Let us define the input-output weight enumerating function (IOWEF) of the SMCBC $C_S$ as

$$A^{C_S}(W,H) = \sum_{w,h} A^{C_S}_{w,h}\, W^w H^h \qquad (1)$$

where $A^{C_S}_{w,h}$ is the number of codewords of the SMCBC with weight $h$ associated with an input word of weight $w$. We also define the conditional weight enumerating function (CWEF) $A^{C_S}(w,H)$ of the SMCBC as the weight distribution of codewords of the SMCBC that have input word weight $w$. It is related to the IOWEF by

$$A^{C_S}(w,H) = \frac{1}{w!} \left. \frac{\partial^w A^{C_S}(W,H)}{\partial W^w} \right|_{W=0} \qquad (2)$$

With knowledge of the CWEF, an upper bound to the bit-error probability of the SMCBC can be obtained in the form

$$P_b(e) \le \sum_{w=1}^{k} \frac{w}{k} \left. A^{C_S}(w,H) \right|_{H = e^{-R_c E_b/N_0}} \qquad (3)$$
where $R_c = k/n$ is the rate of $C_S$, and $E_b/N_0$ is the signal-to-noise ratio per bit.
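A bound of this kind is simple to evaluate once the weight coefficients are known. The following Python sketch assumes the bound has the form of a sum over (w, h) of (w/k) * A[w,h] * exp(-h * Rc * Eb/N0); the function name and the toy (3,1) repetition code are hypothetical and serve only as an illustration.

```python
import math

def bit_error_bound(coeffs, k, rate, ebn0_db):
    """Sum of (w/k) * A[w,h] * exp(-h * Rc * Eb/N0) over all (w, h)."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)          # dB to linear
    return sum((w / k) * a * math.exp(-h * rate * ebn0)
               for (w, h), a in coeffs.items())

# Toy (3,1) repetition code: one codeword of weight 3 for input weight 1.
bound_0db = bit_error_bound({(1, 3): 1}, k=1, rate=1/3, ebn0_db=0.0)
```

For the toy code the bound reduces to exp(-Eb/N0), which makes the role of the product of codeword weight and code rate easy to see.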

The problem consists in the evaluation of the CWEF of the SMCBC from the knowledge of the CWEFs of the outer and inner codes, which we call $A^{C_o}(w,L)$ and $A^{C_i}(l,H)$. To do this, we exploit the properties of the uniform interleaver, which transforms a codeword of weight $l$ at the output of the outer encoder into all its distinct $\binom{N}{l}$ permutations. As a consequence, each codeword of the outer code $C_o$ of weight $l$, through the action of the uniform interleaver, enters the inner encoder generating $\binom{N}{l}$ codewords of the inner code $C_i$. Thus, the number $A^{C_S}_{w,h}$ of codewords of the SMCBC of weight $h$ associated with an input word of weight $w$ is given by

$$A^{C_S}_{w,h} = \sum_{l=0}^{N} \frac{A^{C_o}_{w,l} \times A^{C_i}_{l,h}}{\binom{N}{l}} \qquad (4)$$

From Equation (4), we derive the expressions of the IOWEF and CWEF of the SMCBC:

$$A^{C_S}(w,H) = \sum_{l=0}^{N} \frac{A^{C_o}_{w,l} \times A^{C_i}(l,H)}{\binom{N}{l}} \qquad (5)$$

$$A^{C_S}(W,H) = \sum_{l=0}^{N} \frac{A^{C_o}(W,l) \times A^{C_i}(l,H)}{\binom{N}{l}} \qquad (6)$$

where $A^{C_o}(W,l)$ is the conditional weight distribution of the input words that generate codewords of the outer code of weight $l$.
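The combination of outer and inner weight coefficients under a uniform interleaver can be sketched in Python. The sketch assumes that each coefficient of the concatenation is the sum over l of (outer coefficient for (w, l)) times (inner coefficient for (l, h)) divided by the binomial coefficient of N over l. The constituent codes, a (4,3) parity-check outer code and a (7,4) Hamming inner code with N = 4, and all function names are hypothetical choices made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

def iowef(generator):
    """Brute-force A[(input weight, codeword weight)] of a binary linear code."""
    k, n = len(generator), len(generator[0])
    counts = Counter()
    for msg in product((0, 1), repeat=k):
        word = [sum(m * g[j] for m, g in zip(msg, generator)) % 2 for j in range(n)]
        counts[(sum(msg), sum(word))] += 1
    return counts

def concatenate(outer, inner, N):
    """Average A[(w, h)] of the serial concatenation under a uniform interleaver."""
    result = Counter()
    for (w, l), a_o in outer.items():
        for (l2, h), a_i in inner.items():
            if l == l2:   # outer output weight must equal inner input weight
                result[(w, h)] += Fraction(a_o * a_i, comb(N, l))
    return result

# Hypothetical constituent codes: (4,3) parity-check outer, (7,4) Hamming inner.
PARITY = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1]]
HAMMING = [[1, 0, 0, 0, 1, 1, 0], [0, 1, 0, 0, 1, 0, 1],
           [0, 0, 1, 0, 0, 1, 1], [0, 0, 0, 1, 1, 1, 1]]
A_cs = concatenate(iowef(PARITY), iowef(HAMMING), N=4)
```

Because the coefficients are averages over all interleavers, they are in general fractions; summed over h they still count the input words of each weight.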

1.1.2. Serially Multiple Concatenated Convolutional Codes

The structure of a serially multiple concatenated convolutional code (SMCCC) is shown in Figure 16. It refers to the case of two convolutional CCs, the outer code $C_o$ with rate $R_c^o = k/p$ and the inner code $C_i$ with rate $R_c^i = p/n$, joined by an interleaver of length N bits, generating an SMCCC $C_S$ with rate $R_c = k/n$. Note that N must be an integer multiple of p. We assume, as before, that the convolutional CCs are linear, so that the SMCCC is linear as well and the uniform error property applies. The exact analysis requires the use of a hypertrellis having as hyperstates pairs of states of the outer and inner codes. The hyperstates $S_{ij}$ and $S_{lm}$ are joined by a hyperbranch that consists of all pairs of paths with length N/p that join states $s_i$ and $s_l$ of the inner code and states $s_j$ and $s_m$ of the outer code, respectively. Each hyperbranch is thus an equivalent SMCBC labeled with an IOWEF that can be evaluated as explained in the previous subsection. From the hypertrellis, the upper bound to the bit-error probability can be obtained through the standard transfer function technique employed for convolutional codes.
1.2. Design of Serially Multiple Concatenated Codes

For practical applications, SMCCCs are to be preferred to SMCBCs. One reason is that maximum a posteriori algorithms are less complex for convolutional than for block codes; a second is that the interleaver gain can be greater for convolutional CCs, provided they are suitably designed. Hence, we deal mainly with the design of SMCCCs, extending our conclusions to SMCBCs when appropriate.

Consider the SMCCC depicted in Figure 16. Its performance can be approximated by that of an equivalent block code whose IOWEF labels the branch of the hypertrellis joining the zero states of the outer and inner codes. Denoting by $A^{C_S}(w,H)$ the CWEF of this equivalent block code, we can rewrite the upper bound, Equation (3), as (a subscript m will denote minimum, and a subscript M will denote maximum)
$$P_b(e) \le \sum_{w=w_m}^{N R_c^o} \frac{w}{N R_c^o} \left. A^{C_S}(w,H) \right|_{H=e^{-R_c E_b/N_0}} = \sum_{h=h_m}^{N/R_c^i} \sum_{w=w_m}^{N R_c^o} \frac{w}{N R_c^o}\, A^{C_S}_{w,h}\, e^{-h R_c E_b/N_0} \qquad (10)$$

where $w_m$ is the minimum weight of an input sequence generating an error event of the outer code, and $h_m$ is the minimum weight of the codewords of $C_S$ (since the input sequences of the inner code are not unconstrained independent identically distributed (i.i.d.) binary sequences but, instead, codewords of the outer code, $h_m$ can be greater than the inner code free distance, $d_f^i$). By "error event of a convolutional code" we mean a sequence diverging from the zero state at time zero and remerging into the zero state at some discrete time j > 0. For constituent block codes, an error event is simply a codeword.

The coefficients $A^{C_S}_{w,h}$ of the equivalent block code can be obtained from Equation (4) once the quantities $A^{C_o}_{w,l}$ and $A^{C_i}_{l,h}$ of the CCs are known. To evaluate them, consider a rate R = p/n convolutional code C with memory $\nu$, and its equivalent (N/R, N - p$\nu$) block code whose codewords are all sequences of length N/R bits of the convolutional code starting from and ending at the zero state. By definition, the codewords of the equivalent block code are concatenations of error events of the convolutional code. Let

$$A(l,H,j) = \sum_{h} A_{l,h,j}\, H^h \qquad (11)$$

be the weight enumerating function of sequences of the convolutional code that concatenate $j$ error events with total input weight $l$ (see Figure 17), where $A_{l,h,j}$ is the number of sequences of weight $h$, input weight $l$, and number of concatenated error events $j$. For N much larger than the memory of the convolutional code, the coefficient $A_{l,h}$ of the equivalent block code can be approximated (this assumption permits neglecting the length of error events compared to N, which also assumes that the number of ways $j$ input sequences producing $j$ error events can be arranged in a register of length N is $\binom{N/p}{j}$; the ratio N/p derives from the fact that the code has rate p/n, and thus N bits correspond to N/p input words or, equivalently, trellis steps) by

$$A_{l,h} \sim \sum_{j=1}^{n_M} \binom{N/p}{j} A_{l,h,j} \qquad (12)$$



where $n_M$, the largest number of error events concatenated in a codeword of weight $h$ and generated by a weight $l$ input sequence, is a function of $h$ and $l$ that depends on the encoder. Let us return now to the block code equivalent to the SMCCC. Using the previous result of Equation (12) with $j = n^i$ for the inner code, and the analogous one, $j = n^o$, for the outer code (superscripts o and i will refer to quantities pertaining to the outer and inner code, respectively),

$$A^{C_o}_{w,l} \sim \sum_{n^o=1}^{n^o_M} \binom{N/p}{n^o} A^o_{w,l,n^o} \qquad (13)$$

and substituting them into Equation (4), we obtain the coefficient $A^{C_S}_{w,h}$ of the serially concatenated block code equivalent to the SMCCC in the form

$$A^{C_S}_{w,h} \sim \sum_{l=d_f^o}^{N} \sum_{n^o=1}^{n^o_M} \sum_{n^i=1}^{n^i_M} \frac{\binom{N/p}{n^o} \binom{N/p}{n^i}}{\binom{N}{l}}\, A^o_{w,l,n^o}\, A^i_{l,h,n^i} \qquad (14)$$
where $d_f^o$ is the free distance of the outer code. By free distance $d_f$ we mean the minimum Hamming weight of error events for convolutional CCs and the minimum Hamming weight of codewords for block CCs. We are interested in large interleaver lengths and thus use for the binomial coefficient the asymptotic approximation

$$\binom{N}{n} \sim \frac{N^n}{n!}$$
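The quality of this large-N approximation of the binomial coefficient by N^n / n! can be checked numerically. The sketch below is illustrative only; the function name and the chosen values of N and n are assumptions.

```python
from math import comb, factorial

def binomial_approx(N, n):
    """Large-N approximation N**n / n! of the binomial coefficient."""
    return N ** n / factorial(n)

# Ratio of the exact value to the approximation for growing N and n = 3.
for N in (10, 100, 10_000):
    ratio = comb(N, 3) / binomial_approx(N, 3)
    print(N, round(ratio, 4))
```

The ratio approaches 1 as N grows, which is why the approximation is reserved for large interleaver lengths.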



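Substitution of this approximation in Equation (14) yields

$$A^{C_S}_{w,h} \sim \sum_{l=d_f^o}^{N} \sum_{n^o=1}^{n^o_M} \sum_{n^i=1}^{n^i_M} N^{n^o+n^i-l}\, \frac{l!}{p^{n^o+n^i}\, n^o!\, n^i!}\, A^o_{w,l,n^o}\, A^i_{l,h,n^i} \qquad (15)$$

Finally, substituting Equation (15) into Equation (10) gives the bit-error probability bound in the form

$$P_b(e) \lesssim \sum_{h=h_m}^{N/R_c^i} e^{-h R_c E_b/N_0} \sum_{w=w_m}^{N R_c^o} \sum_{l=d_f^o}^{N} \sum_{n^o=1}^{n^o_M} \sum_{n^i=1}^{n^i_M} N^{n^o+n^i-l-1}\, \frac{l!}{p^{n^o+n^i-1}\, n^o!\, n^i!}\, \frac{w}{k}\, A^o_{w,l,n^o}\, A^i_{l,h,n^i} \qquad (16)$$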

Using Expression (16) as the starting point, we will obtain some important design considerations. The bound, Expression (16), to the bit-error probability is obtained by adding terms of the first summation with respect to the SMCCC weights $h$. The coefficients of the exponential in $h$ depend, among other parameters, on N. For large N, and for a given $h$, the dominant coefficient of the exponential in $h$ is the one for which the exponent of N is maximum. Define this maximum exponent as

$$\alpha(h) = \max_{w,l} \left\{ n^o + n^i - l - 1 \right\} \qquad (17)$$

Evaluating $\alpha(h)$ in general is not possible without specifying the CCs. Thus, we will consider two important cases for which general expressions can be found.
1.2.1. The Exponent of N for the Minimum Weight

For large values of $E_b/N_0$, the performance of the SMCCC is dominated by the first term of the summation in $h$, corresponding to the minimum value $h = h_m$. Remembering that, by definition, $n^i_M$ and $n^o_M$ are the maximum number of concatenated error events in codewords of the inner and outer code of weights $h_m$ and $l$, respectively, the following inequalities hold true:



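$$n^i_M \le \left\lfloor \frac{h_m}{d^i_f} \right\rfloor \qquad (18)$$

$$n^o_M \le \left\lfloor \frac{l}{d^o_f} \right\rfloor \qquad (19)$$

and

$$\alpha(h_m) \le \max_{l} \left\{ \left\lfloor \frac{h_m}{d^i_f} \right\rfloor + \left\lfloor \frac{l}{d^o_f} \right\rfloor - l - 1 \right\} = \left\lfloor \frac{h_m}{d^i_f} \right\rfloor + \left\lfloor \frac{l_m(h_m)}{d^o_f} \right\rfloor - l_m(h_m) - 1 \qquad (20)$$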



where $l_m(h_m)$ is the minimum weight $l$ of codewords of the outer code yielding a codeword of weight $h_m$ of the inner code, and $\lfloor x \rfloor$ means "integer part of $x$" (floor value). In most cases, $l_m(h_m) < 2d^o_f$ and $h_m < 2d^i_f$, so that $n^i_M = n^o_M = 1$ and Equation (20) becomes

$$\alpha(h_m) = 1 - l_m(h_m) \le 1 - d^o_f \qquad (21)$$
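The bound of Equation (20) is simple integer arithmetic, so it can be checked directly. The sketch below is illustrative only: the function name is hypothetical, and the values of $l_m(h_m)$ used in the calls are assumptions (taken equal to $d^o_f$, which is what the tabulated exponents $\alpha(h_m) = 1 - l_m(h_m)$ of Table 2 imply).

```python
def alpha_hm_bound(h_m: int, l_m: int, d_inner: int, d_outer: int) -> int:
    """Right-hand side of Equation (20): exponent of N for the minimum weight.

    h_m     -- minimum weight of the concatenated code
    l_m     -- minimum outer-codeword weight yielding inner weight h_m
    d_inner -- free distance of the inner code
    d_outer -- free distance of the outer code
    """
    n_inner_max = h_m // d_inner   # Equation (18)
    n_outer_max = l_m // d_outer   # Equation (19)
    return n_inner_max + n_outer_max - l_m - 1


# Block-code examples, with l_m assumed equal to the outer free distance.
print(alpha_hm_bound(h_m=3, l_m=2, d_inner=3, d_outer=2))  # parity check (4,3) + Hamming (7,4)
print(alpha_hm_bound(h_m=5, l_m=3, d_inner=5, d_outer=3))  # Hamming (7,4) + BCH (15,7)
```

When $n^i_M = n^o_M = 1$ the function reduces to $1 - l_m(h_m)$, which is Equation (21).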
The result, Equation (21), shows that the exponent of $N$ corresponding to the minimum weight of SMCCC codewords is always negative for $d^o_f \ge 2$, thus yielding an interleaver gain at high $E_b/N_0$. Substitution of the exponent $\alpha(h_m)$ into Expression (16) truncated to the first term of the summation in $h$ yields

$$\lim_{E_b/N_0 \to \infty} P_b(e) \lesssim B_m\, N^{\alpha(h_m)} \exp\left(-h_m R_c \frac{E_b}{N_0}\right) \qquad (22)$$



where the constant $B_m$ is

$$B_m = \frac{A^i_{l_m(h_m),h_m,1}\, [l_m(h_m)]!}{k\,p} \sum_{w \in W_m} w\, A^o_{w,l_m(h_m),1}$$

and $W_m$ is the set of input weights $w$ that generates codewords of the outer code with weight $l_m(h_m)$. Expression (22) suggests the following conclusions:

1. For the values of $E_b/N_0$ and $N$ where the SMCCC performance is dominated by its free distance $d_f = h_m$, increasing the interleaver length yields a gain in performance.

2. To increase the interleaver gain, one should choose an outer code with a large $d^o_f$.

3. To improve the performance with $E_b/N_0$, one should choose an inner and outer code combination such that $h_m$ is large.

These conclusions do not depend on the structure of the CCs, and thus they apply for both recursive and nonrecursive encoders.

However, for a given $E_b/N_0$, there seems to be a minimum value of $N$ that forces the bound to diverge. In other words, there seem to be coefficients of the exponents in $h$, for $h > h_m$, that increase with $N$. To investigate this phenomenon, we will evaluate the largest exponent of $N$, defined as

$$\alpha_M \triangleq \max_{h} \{ \alpha(h) \} = \max_{w,l,h} \left\{ n^o + n^i - l - 1 \right\} \qquad (23)$$

This exponent will permit one to find the dominant contribution to the bit-error probability for $N \to \infty$.
1.2.2. The Maximum Exponent of N

We need to treat the cases of nonrecursive and recursive inner encoders separately. As we will see, nonrecursive 
encoders and block encoders show the same behavior. 
1.2.2.1. Block and Nonrecursive Convolutional Inner Encoders.

Consider the inner code and its impact on the exponent of $N$ in Equation (23). For a nonrecursive inner encoder, we have $n^i_M = l$. In fact, every input sequence with weight 1 generates a finite-weight error event, so that an input sequence with weight $l$ will generate, at most, $l$ error events corresponding to the concatenation of $l$ error events of input weight 1. Since the uniform interleaver generates all possible permutations of its input sequences, this event will certainly occur. Thus, from Equation (23) we have

$$\alpha_M = n^o_M - 1 \ge 0$$

and interleaving gain is not allowed. This conclusion holds true both for SMCCCs employing a nonrecursive inner encoder and for all SMCBCs, since block codes have codewords corresponding to input words with weight equal to 1. For those SMCCCs, we always have, for some $h$, coefficients of the exponential in $h$ of Expression (16) that increase with $N$, and this explains the divergence of the bound arising, for each $E_b/N_0$, when the coefficients increasing with $N$ become dominant.






WO 00/07323 



PCT/US99/17369 



1.2.2.2. Recursive Inner Encoders.

For recursive convolutional encoders, the minimum weight of input sequences generating error events is 2. As a consequence, an input sequence of weight $l$ can generate at most $\lfloor l/2 \rfloor$ error events.

Assuming that the inner encoder of the SMCCC is recursive, the maximum exponent of $N$ in Equation (23) becomes

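$$\alpha_M = \max_{w,l} \left\{ n^o_M + \left\lfloor \frac{l}{2} \right\rfloor - l - 1 \right\} = \max_{w,l} \left\{ n^o_M - \left\lfloor \frac{l+1}{2} \right\rfloor - 1 \right\} \qquad (24)$$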



The maximization involves $l$ and $w$, since $n^o_M$ depends on both quantities. In fact, remembering the definition of $n^o_M$ as the maximum number of concatenated error events of codewords of the outer code with weight $l$ generated by input words of weight $w$, it is straightforward, as in Equation (19), to obtain



$$n^o_M \le \left\lfloor \frac{l}{d^o_f} \right\rfloor \qquad (25)$$



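Substituting now the last inequality, Equation (25), into Equation (24) yields

$$\alpha_M \le \max_{l} \left\{ \left\lfloor \frac{l}{d^o_f} \right\rfloor - \left\lfloor \frac{l+1}{2} \right\rfloor - 1 \right\} \qquad (26)$$

To perform the maximization of the right-hand side (RHS) of Expression (26), consider first the case of

$$l = q\, d^o_f$$

where $q$ is an integer, so that

$$\alpha_M \le \max_{q} \left\{ q - \left\lfloor \frac{q\, d^o_f + 1}{2} \right\rfloor - 1 \right\} \qquad (27)$$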



The RHS of Expression (27) is maximized, for $d^o_f \ge 2$, by choosing $q = 1$. On the other hand, for

$$q\, d^o_f < l < (q+1)\, d^o_f$$

the most favorable case is $l = q\, d^o_f$, which leads us again to the previously discussed situation. Thus, the maximization requires $l = d^o_f$. For this value, on the other hand, we have, from Equation (25), $n^o_M \le 1$, and the inequality becomes an equality if $w \in W_f$, where $W_f$ is the set of input weights $w$ that generates codewords of the outer code with weight $l = d^o_f$. In conclusion, the largest exponent of $N$ is given by



$$\alpha_M = -\left\lfloor \frac{d^o_f + 1}{2} \right\rfloor \qquad (28)$$
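The maximization leading to Equation (28) can be checked by brute force. The following sketch, with hypothetical function names and an arbitrary search limit `l_max`, evaluates the right-hand side of Expression (26) over all outer-codeword weights $l \ge d^o_f$ and compares the maximum with the closed form $-\lfloor (d^o_f + 1)/2 \rfloor$.

```python
def alpha_M_bruteforce(d_outer: int, l_max: int = 200) -> int:
    """Maximum over l of floor(l/d_outer) - floor((l+1)/2) - 1, Expression (26).

    Only l >= d_outer is searched, since no outer codeword has lower weight.
    """
    return max(l // d_outer - (l + 1) // 2 - 1
               for l in range(d_outer, l_max + 1))


def alpha_M_closed_form(d_outer: int) -> int:
    """Closed form of the largest exponent of N for a recursive inner encoder."""
    return -((d_outer + 1) // 2)


for d in range(2, 9):
    print(d, alpha_M_bruteforce(d), alpha_M_closed_form(d))
```

For every outer free distance in the loop the two values agree and are negative, which is the interleaver-gain property discussed in the text.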



The value of $\alpha_M$ in Equation (28) shows that the exponents of $N$ in Expression (16) are always negative integers. Thus, for all $h$, the coefficients of the exponents in $h$ decrease with $N$, and we always have an interleaver gain. Denoting by $d^i_{f,eff}$ the minimum weight of codewords of the inner code generated by weight-2 input sequences, we obtain a different weight $h(\alpha_M)$ for even and odd values of $d^o_f$. For even $d^o_f$, the weight $h(\alpha_M)$ associated to the highest exponent of $N$ is given by

$$h(\alpha_M) = \frac{d^o_f\, d^i_{f,eff}}{2}$$

since it is the weight of an inner codeword that concatenates $d^o_f/2$ error events with weight $d^i_{f,eff}$. Substituting the exponent $\alpha_M$ into Expression (16), approximated by only the term of the summation in $h$ corresponding to $h = h(\alpha_M)$, yields



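$$\lim_{N \to \infty} P_b(e) \lesssim B_{even}\, N^{-d^o_f/2} \exp\left(-\frac{d^o_f\, d^i_{f,eff}}{2}\, R_c \frac{E_b}{N_0}\right) \qquad (29a)$$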



where 



d°f 



2 )! k P2 (~~)^ 



(29b) 



In Equation (29b), $w^o_{M,f}$ is the maximum input weight yielding outer codewords with weight equal to $d^o_f$, and $N^o_f$ is the number of such codewords.

For $d^o_f$ odd, the value of $h(\alpha_M)$ is given by



$$h(\alpha_M) = \frac{(d^o_f - 3)\, d^i_{f,eff}}{2} + h^{(3)}_m \qquad (30)$$



where $h^{(3)}_m$ is the minimum weight of sequences of the inner code generated by a weight-3 input sequence. In this case, in fact, we have $n^i_M = (d^o_f - 1)/2$ concatenated error events, of which $n^i_M - 1$ are generated by weight-2 input sequences and one is generated by a weight-3 input sequence.

Thus, substituting the exponent $\alpha_M$ into Expression (16) approximated by keeping only the term of the summation in $h$ corresponding to $h = h(\alpha_M)$ yields
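The weight $h(\alpha_M)$ follows directly from the error-event counting described above. The sketch below is a hedged illustration: the function name is hypothetical, the even case uses $d^o_f/2$ weight-2 events, the odd case uses $(d^o_f - 3)/2$ weight-2 events plus one weight-3 event, and the sample values of $h^{(3)}_m$ (3 and 5) are assumptions taken equal to the inner free distances listed for the convolutional examples.

```python
from typing import Optional


def h_alpha_M(d_outer: int, d_inner_eff: int, h3_min: Optional[int] = None) -> int:
    """Inner-codeword weight associated with the largest exponent of N.

    d_outer     -- free distance of the outer code
    d_inner_eff -- effective free distance of the inner code (weight-2 inputs)
    h3_min      -- minimum inner weight for weight-3 inputs (needed if d_outer is odd)
    """
    if d_outer % 2 == 0:
        # d_outer/2 error events, each produced by a weight-2 input.
        return (d_outer // 2) * d_inner_eff
    if h3_min is None:
        raise ValueError("h3_min is required when d_outer is odd")
    # (d_outer - 3)/2 weight-2 events plus one weight-3 event.
    return ((d_outer - 3) // 2) * d_inner_eff + h3_min


print(h_alpha_M(5, 4, 3))  # outer free distance 5, effective inner distance 4
print(h_alpha_M(3, 6, 5))  # outer free distance 3, effective inner distance 6
```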

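$$\lim_{N \to \infty} P_b(e) \lesssim B_{odd}\, N^{-(d^o_f+1)/2} \exp\left\{-\left[\frac{(d^o_f - 3)\, d^i_{f,eff}}{2} + h^{(3)}_m\right] R_c \frac{E_b}{N_0}\right\} \qquad (31)$$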



where 



Bpdd ~ 



dV 



kp~ 



<d°,-n 



(d° f -3) 



k P - 



<f> f -l 



(d° r S) 



(32) 



In cases of $d^o_f$ both even and odd, we can draw from Expressions (29) and (31) a few important design considerations, as follows:

(1) In contrast with the case of block codes and nonrecursive convolutional inner encoders, the use of a recursive convolutional inner encoder always yields an interleaver gain. As a consequence, the first design rule states that the inner encoder must be a convolutional recursive encoder.

(2) The coefficient $h(\alpha_M)$ that multiplies the signal-to-noise ratio $E_b/N_0$ in Expression (16) increases for increasing values of $d^i_{f,eff}$. Thus, we deduce that the effective free distance of the inner code must be maximized. Both this and the previous design rule also had been stated for PMCCCs. As a consequence, the recursive convolutional encoders optimized for use in PMCCCs can be employed altogether as inner CC in SMCCCs. When $d^o_f$ is odd, for special cases it is possible to increase $h(\alpha_M)$ and $h_m$ further by choosing the feedback polynomial of the inner code to have a factor $(1 + D)$, yielding $h^{(3)}_m = \infty$. Note that there are other feedback polynomials such as $(1 + D + D^2 + D^3 + D^4)$ or $(1 + D + D^2 + D^3 + D^4 + D^5 + D^6)$ yielding $h^{(3)}_m = \infty$.

(3) The interleaver gain is equal to $N^{-d^o_f/2}$ for even values of $d^o_f$ and to $N^{-(d^o_f+1)/2}$ for odd values of $d^o_f$. As a consequence, we should choose, compatibly with the desired rate $R_c$ of the SMCCC, an outer code with a large and, possibly, an odd value of the free distance.

(4) As to other outer code parameters, $N^o_f$ and $w^o_{M,f}$ should be minimized. In other words, we should have the minimum number of input sequences generating free distance error events of the outer code, and their input weights should be minimized. Since nonrecursive encoders have error events with $w = 1$ and, in general, fewer input errors associated with error events at free distance, it can be convenient to choose as an outer code a nonrecursive encoder with minimum $N^o_f$ and $w^o_{M,f}$.

1.2.2.3. Examples Confirming the Design Rules

To confirm the design rules obtained asymptotically (i.e., for large signal-to-noise ratios and large interleaver lengths $N$), we evaluate the upper bound, Expression (16), to the bit-error probability for several block and convolutional SMCCs with different interleaver lengths, and compare their performances with those predicted by the design guidelines.
1.2.2.3.1. Serially Multiple Concatenated Block Codes.

We consider three different SMCBCs obtained as follows: 

a) The first is the (7m, 3m, N) SMCBC; 

b) The second is a (15m, 4m, N) SMCBC using as outer code a (5, 4) parity-check code and as inner code a (15, 5) 
Bose-Chaudhuri-Hocquenghem (BCH) code; 

c) The third is a (15m, 4m, N) SMCBC using as outer code a (7, 4) Hamming code and as inner code a (15, 7) BCH code.

Note that the second and third SMCBCs have the same rate, 4/15. The outer, inner, and SMCBC code parameters 
introduced in the design analysis are listed in Table 2. 

In Figures 18, 19 and 20, we plot the bit-error probability bounds for SMCBCs 1, 2 and 3 of Table 2. Code SMCBC1 has $d^o_f = 2$; thus, from Equation (21), we expect an interleaver gain going as $N^{-1}$. This is confirmed by the curves of Figure 18, which, for a fixed and sufficiently large signal-to-noise ratio, show a decrease in $P_b(e)$ of a factor of 10 when $N$ passes from 4 to 40, from 40 to 400, and from 400 to 4000. Moreover, from Expression (22), we expect, in each curve for $\ln P_b(e)$, a slope with $E_b/N_0$ as $-h_m R_c$. For this (7m, 3m, N) code, $R_c = 3/7$, and from Table 2, $h_m = 3$, so that $P_b(e)$ should decrease by a factor of $e^{h_m R_c} \approx 3.6$ when the signal-to-noise ratio increases by 1 (not in dB). This behavior fully agrees with the curves of Figure 18. Finally, the curves of Figure 18 show a divergence of the bound at lower $E_b/N_0$ for increasing $N$. This is due to coefficients of terms with $h > h_m$ in Expression (16) that increase with $N$ and whose influence becomes more important for larger $N$.



Table 2. Design parameters of CCs and SMCBCs for three serially concatenated block codes.

Code   | Outer code type    | $w^o_m$ | $d^o_f$ | Inner code type | $w^i_m$ | $d^i_f$ | $d^i_{f,eff}$ | $h_m$ | $\alpha(h_m)$
SMCBC1 | Parity check (4,3) | 1       | 2       | Hamming (7,4)   | 1       | 3       | 3             | 3     | -1
SMCBC2 | Parity check (5,4) | 1       | 2       | BCH (15,5)      | 1       | 7       | 7             | 7     | -1
SMCBC3 | Hamming (7,4)      | 1       | 3       | BCH (15,7)      | 1       | 5       | 5             | 5     | -2



Code SMCBC2 has $d^o_f = 2$; thus, from Equation (21), we expect the same interleaver gain as for SMCBC1, i.e., $N^{-1}$. This is confirmed by the curves of Figure 19. This code, however, has a larger minimum distance $h_m = 7$, and a rate $R_c = 4/15$. Thus, we expect a steeper descent of $P_b(e)$ with $E_b/N_0$. More precisely, we expect a decrease by a factor of 6.5 when the signal-to-noise ratio increases by 1. This, too, is confirmed by the curves, which also show the bound divergence predicted in our analysis.

Code SMCBC3 has $d^o_f = 3$; thus, from Equation (21), we expect a larger interleaver gain than for SMCBC1 and SMCBC2, i.e., $N^{-2}$. This is confirmed by the curves of Figure 20, which, for a fixed and sufficiently large signal-to-noise ratio, show a decrease in $P_b(e)$ of a factor of 100 when $N$ passes from 7 to 70, from 70 to 700, and from 700 to 7000. This code has a minimum distance $h_m = 5$ and a rate $R_c = 4/15$, which means a descent of $P_b(e)$ with $E_b/N_0$ by a factor of 3.8 when the signal-to-noise ratio increases by 1. This, too, is confirmed by the curves. As to the bound divergence, we notice a slightly different behavior with respect to previous cases. The curve with $N = 7000$, in fact, denotes a strong influence of coefficients increasing with $N$ for $E_b/N_0$ lower than 7.
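The slope factors quoted for the three block-code examples can be reproduced with a few lines of arithmetic. This is only a sanity-check sketch: the function and dictionary names are hypothetical, and the rate of the first code is taken as 3/7, the rate of a (7m, 3m, N) code.

```python
import math


def slope_factor(h_min: int, rate: float) -> float:
    """Factor by which P_b(e) drops when E_b/N_0 (linear, not dB) grows by 1."""
    return math.exp(h_min * rate)


# (minimum distance h_m, code rate R_c) for each example code.
block_codes = {
    "SMCBC1": (3, 3 / 7),
    "SMCBC2": (7, 4 / 15),
    "SMCBC3": (5, 4 / 15),
}

for name, (h_min, rate) in block_codes.items():
    print(name, round(slope_factor(h_min, rate), 1))
```

The rounded results match the factors 3.6, 6.5, and 3.8 stated in the text.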
1.2.2.3.2. Serially Multiple Concatenated Convolutional Codes.

We consider four different SMCCCs obtained as follows: The first, SMCCC1, is a (3,1,N) SMCCC, using as outer code a four-state (2,1) recursive, systematic convolutional encoder and as inner code a four-state (3,2) recursive, systematic convolutional encoder. The second, SMCCC2, is a (3,1,N) SMCCC, using as outer code the same four-state (2,1) recursive, systematic convolutional encoder as SMCCC1, and as inner code a four-state (3,2) nonrecursive convolutional encoder. The third, SMCCC3, is a (3,1,N) SMCCC, using as outer code a four-state (2,1) nonrecursive convolutional encoder, and as inner code the same four-state (3,2) recursive, systematic convolutional encoder as SMCCC1. Finally, the fourth, SMCCC4, is a (6,2,N) SMCCC using as outer code a four-state (3,2) nonrecursive convolutional encoder, and as inner code a four-state (6,3) recursive, systematic convolutional encoder obtained by using three times the four-state (2,1) recursive, systematic convolutional encoders in Table 2.

The outer, inner, and SMCCC code parameters introduced in the design analysis are listed in Table 3. In this table, the CCs are identified through the descriptions of Table 2. In Figures 21, 22, 23, and 24, we plot the bit-error probability bounds for SMCCCs 1, 2, 3, and 4 of Table 3, with input information block lengths $R^o_c N$ = 100, 200, 300, 400, 500, and 1000.

Consider first the SMCCCs employing as inner CCs recursive, convolutional encoders. They are SMCCC1, SMCCC3, and SMCCC4. Code SMCCC1 has $d^o_f = 5$; thus, from Expression (31), we expect an interleaver gain behaving as $N^{-3}$. This is fully confirmed by the curves of Figure 21, which, for a fixed and sufficiently large signal-to-noise ratio, show a decrease in $P_b(e)$ of a factor of 1000 when $N$ passes from 200 to 2000. For an even more accurate confirmation, one can compare the interleaver gain for every pair of curves in Figure 21. Moreover, from Expression (31), we expect in each curve for $\ln P_b(e)$ a slope with $E_b/N_0$ as $-h(\alpha_M) R_c$. From Table 3, we know that $R_c = 1/3$ and $h(\alpha_M) = 7$, so that $P_b(e)$ should decrease by a factor of 10.3 when the signal-to-noise ratio increases by 1. This behavior fully agrees with the curves of Figure 21. Finally, the curves of Figure 21 do not show a divergence of the bound at lower $E_b/N_0$ for increasing $N$. This is due to the choice of a recursive encoder for the inner code, which guarantees that all coefficients $\alpha(h)$ decrease with $N$.
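The two numbers quoted for SMCCC1 can be checked in the same way as for the block codes. In this sketch the function name is hypothetical; it evaluates the predicted interleaver gain $N^{\alpha_M}$ between two interleaver lengths and the slope factor $e^{h(\alpha_M) R_c}$.

```python
import math


def interleaver_gain_factor(n_small: int, n_large: int, alpha: int) -> float:
    """Predicted reduction of P_b(e) when N grows from n_small to n_large,
    for a bound that scales as N**alpha (alpha is negative)."""
    return (n_large / n_small) ** (-alpha)


# Outer free distance 5 gives alpha_M = -3: a tenfold increase of N is
# predicted to lower the bound by three orders of magnitude.
print(interleaver_gain_factor(200, 2000, -3))

# Slope factor for h(alpha_M) = 7 and R_c = 1/3.
print(round(math.exp(7 * (1 / 3)), 1))
```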



Table 3. Design parameters of CCs and SMCCCs for four SMCCCs.

Code   | Outer code type       | $w^o_m$ | $d^o_f$ | Inner code type       | $w^i_m$ | $d^i_f$ | $d^i_{f,eff}$ | $h_m$ | $\alpha(h_m)$ | $h(\alpha_M)$ | $\alpha_M$
SMCCC1 | Rate 1/2 recursive    | 2       | 5       | Rate 2/3 recursive    | 2       | 3       | 4             | 5     | -4            | 7             | -3
SMCCC2 | Rate 1/2 recursive    | 2       | 5       | Rate 2/3 nonrecursive | 1       | 3       | 4             | 5     | -4            |               |
SMCCC3 | Rate 1/2 nonrecursive | 1       | 5       | Rate 2/3 recursive    | 2       | 3       | 4             | 5     | -4            | 7             | -3
SMCCC4 | Rate 2/3 nonrecursive | 1       | 3       | Rate 1/2 recursive    | 2       | 5       | 6             | 5     | -2            | 5             | -2
Code SMCCC3 differs from SMCCC1 only in the choice of a nonrecursive outer encoder, which is a four-state encoder (see Tables 2 and 3) with the same $d^o_f$ as for SMCCC1, but with $w^o_m = 1$ instead of $w^o_m = 2$.

From the design conclusions, we expect a slightly better behavior from this SMCCC. This is confirmed by the performance curves of Figure 23, which present the same interleaver gain and slope as those of SMCCC1 but have a slightly lower $P_b(e)$ (the curves for SMCCC3 are translated versions of those of SMCCC1 by 0.1 dB).

Code SMCCC4 employs the same CCs as SMCCC2 but reverses their order. It uses as outer code a rate 2/3 nonrecursive convolutional encoder, and as inner code, a rate 1/2 recursive convolutional encoder. As a consequence, it has a lower $d^o_f = 3$ and a higher $\alpha_M = -2$. Thus, from Expression (31), we expect a lower interleaver gain than for SMCCC1 and SMCCC3, as $N^{-2}$. This is confirmed by the curves of Figure 24, which, for a fixed and sufficiently large signal-to-noise ratio, show a decrease in $P_b(e)$ of a factor of 100 when $N$ passes from 150 to 1500. As to the slope with $E_b/N_0$, this code has the same $h(\alpha_M) R_c$ as SMCCC1 and SMCCC3 and, thus, the same slope. On the whole, SMCCC4 loses more than 2 dB in coding gain with respect to SMCCC3. This result confirms the design rule suggesting the choice of an outer code with $d^o_f$ as large as possible.

Finally, let us consider code SMCCC2, which differs from SMCCC1 in the choice of a nonrecursive inner encoder, with the same parameters but with the crucial difference of $w^i_m = 1$. Its bit-error probability curves are shown in Figure 22. We see, in fact, that for low signal-to-noise ratios, say below 3, no interleaver gain is obtained. This is because the performance is dominated by the exponent $h(\alpha_M)$, whose coefficient increases with $N$. On the other hand, for larger signal-to-noise ratios, where the dominant contribution to $P_b(e)$ is the exponent with the lowest value of $h_m$, the interleaver gain makes its appearance. From Expression (22), we foresee a gain as $N^{-4}$, meaning four orders of magnitude for $N$ passing from 100 to 1000. Curves in Figure 22 show a smaller gain (slightly higher than 1/1000), which is, on the other hand, rapidly increasing with $E_b/N_0$.

1.3. Parallel Multiple Concatenated Convolutional Codes

The concept of Parallel Multiple Concatenated Convolutional Code (PMCCC) relies on soft-output decoding and iterative decoding. We present two versions of a simplified maximum a posteriori (MAP) decoding algorithm. The algorithms work in a sliding window form (like the Viterbi algorithm) and can thus be used to decode continuously transmitted sequences obtained by PMCCC, without requiring code trellis termination. A heuristic explanation is also given of how to embed the maximum a posteriori algorithms into the iterative decoding of PMCCC. The performances of the two algorithms are compared on the basis of a powerful rate 1/3 PMCCC. Basic circuits to implement the simplified a posteriori decoding algorithm using lookup tables, and two further approximations (linear and threshold), with a very small penalty, to eliminate the need for lookup tables are proposed.

The broad framework of this analysis encompasses digital transmission systems where the received signal is a sequence of waveforms whose correlation extends well beyond $T$, the signaling period. There can be many reasons for this correlation, such as coding, intersymbol interference (ISI), or correlated fading. The optimum receiver in such situations cannot perform its decisions on a symbol-by-symbol basis, so that deciding on a particular information symbol $u_k$ involves processing a portion of the received signal $T_d$ seconds long, with $T_d > T$. The decision rule can be either optimum with respect to a sequence of symbols, $u^n_k = (u_k, u_{k+1}, \ldots, u_{k+n-1})$, or with respect to the individual symbol, $u_k$.

The most widely applied algorithm for the first kind of decision rule is the Viterbi algorithm. In its optimum formulation, it would require waiting for decisions until the whole sequence has been received. In practical implementations, this drawback is overcome by anticipating decisions (single or in batches) on a regular basis with a fixed delay, $D$. Choice of $D$ as five to six times the memory of the received data is widely recognized as a good compromise between performance, complexity, and decision delay.
Optimum symbol decision algorithms must base their decisions on the maximum a posteriori (MAP) probability. They have been known since the early seventies, although they are much less popular than the Viterbi algorithm and almost never applied in practical systems. There is a very good reason for this neglect in that they yield performance in terms of symbol error probability only slightly superior to the Viterbi algorithm, yet they present a much higher conceptual complexity. Only recently, the interest in these algorithms has seen a revival in connection with the problem of decoding concatenated coding schemes. Concatenated coding schemes (a class in which we include product codes, multilevel codes, generalized concatenated codes, and serial and parallel concatenated codes) were proposed as a means of achieving large coding gains by combining two or more relatively simple "constituent" codes. The resulting concatenated coding scheme is a powerful code endowed with a structure that permits an easy decoding, like "stage decoding" or "iterated stage decoding".

To work properly, all these decoding algorithms cannot limit themselves to passing the symbols decoded by the inner decoder to the outer decoder. They need to exchange some kind of soft information. The optimum output of the inner decoder should be in the form of the sequence of the probability distributions over the inner code alphabet conditioned on the received signal, the a posteriori probability (APP) distribution. There have been several attempts to achieve, or at least to approach, this goal. Some of them are based on modifications of the Viterbi algorithm so as to obtain, at the decoder output, in addition to the "hard"-decoded symbols, some reliability information. This has led to the concept of "augmented-output" or the list-decoding Viterbi algorithm, and to the soft-output Viterbi algorithm (SOVA). These solutions are clearly sub-optimal, as they are unable to supply the required APP. A different approach consisted in revisiting the original symbol MAP decoding algorithms with the aim of simplifying them to a form suitable for implementation. Figure 25 shows a PMCCC whose encoder is formed by two (or more) constituent systematic encoders joined through an interleaver. The input information bits feed the first encoder and, after having been interleaved by the interleaver, enter the second encoder. The codeword of the PMCCC comprises the input bits to the first encoder followed by the parity check bits of both encoders. Generalizations to more than one interleaver are possible and fruitful.
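A minimal sketch of the parallel encoder structure just described is given below. It is not the encoder of Figure 25: the four-state recursive systematic constituent code (feedback $1 + D + D^2$, feedforward $1 + D^2$), the permutation, and all function names are assumptions chosen only to make the example concrete. The codeword is the input bits followed by the parity bits of the two constituent encoders, which gives rate 1/3 without trellis termination.

```python
from typing import List, Sequence


def rsc_parity(bits: Sequence[int]) -> List[int]:
    """Parity stream of a four-state recursive systematic encoder.

    Assumed generators: feedback 1 + D + D^2, feedforward 1 + D^2.
    """
    s1 = s2 = 0                      # shift-register contents (D and D^2 taps)
    parity = []
    for u in bits:
        a = u ^ s1 ^ s2              # feedback bit entering the register
        parity.append(a ^ s2)        # feedforward taps 1 and D^2
        s1, s2 = a, s1               # shift the register
    return parity


def pmccc_encode(bits: Sequence[int], permutation: Sequence[int]) -> List[int]:
    """Systematic bits, then parity of encoder 1, then parity of encoder 2.

    The second encoder sees the input bits reordered by the interleaver.
    """
    interleaved = [bits[i] for i in permutation]
    return list(bits) + rsc_parity(bits) + rsc_parity(interleaved)


codeword = pmccc_encode([1, 0, 0, 0], permutation=[3, 0, 1, 2])
print(codeword)
```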

The sub-optimal iterative decoder is modular and comprises a number of equal component blocks formed by concatenating soft decoders of the constituent codes (CC) separated by the interleavers used at the encoder side. By increasing the number of decoding modules and, thus, the number of decoding iterations, bit-error probabilities as low as $10^{-5}$ at $E_b/N_0 = 0.0$ dB for rate 1/4 PMCCC have been shown by simulation.

We will describe two versions of a simplified MAP decoding algorithm that can be used as building blocks of the iterative decoder to decode PMCCCs. A distinctive feature of the algorithms is that they work in a "sliding window" form, like the Viterbi algorithm, and thus can be used to decode "continuously transmitted" PMCCCs, without requiring trellis termination and a block-equivalent structure of the code.

The final aim is to find suitable soft-output decoding algorithms for iterated staged decoding of PMCCC employed in a continuous transmission.

We will refer to the transmission system of Figure 26. The information sequence $u$, composed of symbols drawn from an alphabet $U = \{u_1, \ldots, u_M\}$ and emitted by the source, enters an encoder that generates code sequences $c$. Both source and code sequences are defined over a time index set $K$ (a finite or infinite set of integers). Denoting the code alphabet $C = \{c_1, \ldots, c_M\}$, the code $\mathcal{C}$ can be written as a subset of the Cartesian product of $C$ by itself $K$ times, i.e., $\mathcal{C} \subseteq C^K$. The code symbols $c_k$ (the index $k$ will refer to time) enter the modulator, which performs a one-to-one mapping of them with its signals, or channel input symbols $x_k$, belonging to the set $X = \{x_1, \ldots, x_M\}$.

The channel symbols $x_k$ are transmitted over a stationary memoryless channel with output symbols $y_k$. The channel is characterized by the transition probability distribution (discrete or continuous, according to the channel model) $P(y|x)$. The channel output sequence is fed to the symbol-by-symbol soft-output demodulator, which produces a sequence of probability distributions $\gamma_k(c)$ over $C$ conditioned on the received signal, according to the memoryless transformation
$$\gamma_k(c) = P(x_k = x(c), y_k) = P(y_k \mid x_k = x(c))\, P_k(c) = \gamma_k(x) \qquad (33)$$

where we have assumed to know the sequence of the a priori probability distributions of the channel input symbols $(P_k(x) : k \in K)$ and made use of the one-to-one mapping $C \to X$.

The sequence of probability distributions $\gamma_k(c)$ obtained by the demodulator on a symbol-by-symbol basis is then supplied to the soft-output symbol decoder, which processes the distributions in order to obtain the probability distributions $P_k(u|y)$. They are defined as

$$P_k(u \mid y) = P(u_k = u \mid y) \qquad (34)$$

The probability distributions $P_k(u|y)$ are referred to in the literature as symbol-by-symbol a posteriori probabilities (APP) and represent the optimum symbol-by-symbol soft output.
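To make the memoryless transformation of Equation (33) concrete, the sketch below computes the joint probabilities $\gamma_k(x) = P(y_k \mid x_k = x) P(x_k = x)$ for one received sample. The additive white Gaussian noise channel, the antipodal symbol set, and the function name are assumptions of this example; the text itself leaves the channel model general.

```python
import math
from typing import List, Sequence


def branch_metrics(y: float, symbols: Sequence[float],
                   priors: Sequence[float], sigma: float) -> List[float]:
    """Joint probabilities gamma(x) = p(y | x) * P(x) for every channel symbol.

    Assumes a real-valued Gaussian channel with noise standard deviation sigma.
    """
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return [
        prior * norm * math.exp(-((y - x) ** 2) / (2.0 * sigma ** 2))
        for x, prior in zip(symbols, priors)
    ]


# One received sample, two equiprobable antipodal symbols.
gammas = branch_metrics(0.8, symbols=[-1.0, 1.0], priors=[0.5, 0.5], sigma=1.0)
print(gammas)
```

The symbol closer to the received sample receives the larger metric, and unequal priors simply scale the corresponding entries.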

From here on, we will limit ourselves to the case of time-invariant convolutional codes with $N$ states, use the following notations with reference to Figure 27, and assume that the (integer) time instant we are interested in is the $k$th:

(1) $S_i$ is the generic state at time $k$, belonging to the set $S = \{S_1, \ldots, S_N\}$.

(2) $S_i^-(u')$ is one of the precursors of $S_i$, and precisely the one defined by the information symbol $u'$ emitted during the transition $S_i^-(u') \to S_i$.

(3) $S_i^+(u)$ is one of the successors of $S_i$, and precisely the one defined by the information symbol $u$ emitted during the transition $S_i \to S_i^+(u)$.

(4) To each transition in the trellis, a signal $x$ is associated, which depends on the state from which the transition originates and on the information symbol $u$ determining that transition. When necessary, we will make this dependence explicit by writing $x(u', S_i)$ when the transition ends in $S_i$ and $x(S_i, u)$ when the transition originates from $S_i$.

1.3.1. The BCJR Algorithm

The BCJR is the optimum algorithm to produce the sequence of APP. We consider first the original version of the algorithm, which applies to the case of a finite index set $K = \{1, \ldots, n\}$ and requires the knowledge of the whole received sequence $y = (y_1, \ldots, y_n)$ to work. In the following, the notations $u$, $c$, $x$, and $y$ will refer to sequences $n$ symbols long, and the integer time variable $k$ will assume the values $1, \ldots, n$. As for the previous assumption, the encoder admits a trellis representation with $N$ states, so that the code sequences $c$ (and the corresponding transmitted signal sequences $x$) can be represented as paths in the trellis and uniquely associated with a state sequence $s = (s_0, \ldots, s_n)$ whose first and last states, $s_0$ and $s_n$, are assumed to be known by the decoder.

Defining the a posteriori transition probabilities from state $S_i$ at time $k$ as

$$\sigma_k(S_i, u) = P(u_k = u, s_{k-1} = S_i \mid y) \qquad (35)$$

the APP $P_k(u|y)$ we want to compute can be obtained as

$$P_k(u \mid y) = \sum_{S_i} \sigma_k(S_i, u) \qquad (36)$$

Thus, the problem of evaluating the APP is equivalent to that of obtaining the a posteriori transition probabilities defined in Equation (35). These can be computed as

$$\sigma_k(S_i, u) = h_\sigma\, \alpha_{k-1}(S_i)\, \gamma_k(x(S_i, u))\, \beta_k(S_i^+(u)) \qquad (37)$$

where:

• $h_\sigma$ is such that $\sum_{S_i, u} \sigma_k(S_i, u) = 1$.

• $\gamma_k(x(S_i, u))$ are the joint probabilities already defined in Equation (33), i.e.,

$$\gamma_k(x) = P(y_k, x_k = x) = P(y_k \mid x_k = x)\, P(x_k = x) \qquad (38)$$

The $\gamma$'s can be calculated from the knowledge of the a priori probabilities of the channel input symbols $x$ and of the transition probabilities of the channel $P(y_k \mid x_k = x)$. For each time $k$, there are $M$ different values of $\gamma$ to be computed, which are then associated to the trellis transitions to form a sort of branch metrics. This information is provided by the symbol-by-symbol soft-output demodulator.

• $\alpha_k(S_i)$ are the probabilities of the states of the trellis at time $k$ conditioned on the past received signals, namely,

$$\alpha_k(S_i) = P(s_k = S_i \mid y_1^k) \qquad (39)$$

where $y_1^k$ denotes the sequence $y_1, \ldots, y_k$. They can be obtained by the forward recursion

$$\alpha_k(S_i) = h_\alpha \sum_{u} \alpha_{k-1}(S_i^-(u))\, \gamma_k(x(u, S_i)) \qquad (40)$$

with $h_\alpha$ a constant determined through the constraint

$$\sum_{S_i} \alpha_k(S_i) = 1$$

and where the recursion is initialized as

$$\alpha_0(S_i) = \begin{cases} 1 & \text{if } S_i = s_0 \\ 0 & \text{otherwise} \end{cases} \qquad (41)$$

• $\beta_k(S_i)$ are the probabilities of the trellis states at time $k$ conditioned on the future received signals, $P(s_k = S_i \mid y_{k+1}^n)$. They can be obtained by the backward recursion

$$\beta_k(S_i) = h_\beta \sum_{u} \beta_{k+1}(S_i^+(u))\, \gamma_{k+1}(x(S_i, u)) \qquad (42)$$

with $h_\beta$ a constant obtainable through the constraint

$$\sum_{S_i} \beta_k(S_i) = 1$$

and where the recursion is initialized as

$$\beta_n(S_i) = \begin{cases} 1 & \text{if } S_i = s_n \\ 0 & \text{otherwise} \end{cases} \qquad (43)$$
We can now formulate the BCJR algorithm by the following steps:

(1) Initialize $\alpha_0$ and $\beta_n$ according to Equations (41) and (43).

(2) As soon as each term $y_k$ of the sequence $y$ is received, the demodulator supplies to the decoder the "branch metrics" $\gamma_k$ of Equation (38), and the decoder computes the probabilities $\alpha_k$ according to Equation (40). The obtained values of $\alpha_k(S_i)$ as well as the $\gamma_k$ are stored for all $k$, $S_i$, and $x$.

(3) When the entire sequence $y$ has been received, the decoder recursively computes the probabilities $\beta_k$ according to the recursion of Equation (42) and uses them together with the stored $\alpha$'s and $\gamma$'s to compute the a posteriori transition probabilities $\sigma_k(S_i, u)$ according to Equation (37) and, finally, the APP $P_k(u|y)$ from Equation (36).
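The three steps above can be sketched in a few dozen lines. This is an illustrative sketch under stated assumptions, not a reference implementation: the trellis is passed as two hypothetical tables (`next_state[s][u]` for the successor state and `out_sym[s][u]` for the index of the channel symbol on that branch), the branch metrics are given as a list of per-time vectors, and the toy two-state accumulator trellis with its artificial metrics exists only to exercise the code.

```python
def _normalize(values):
    """Scale a list of nonnegative numbers so that it sums to one."""
    total = sum(values)
    return [v / total for v in values]


def bcjr(gamma, next_state, out_sym, s0, sn):
    """Symbol APPs for a terminated trellis (forward, backward, combine)."""
    n, n_states, n_inputs = len(gamma), len(next_state), len(next_state[0])

    # Forward recursion, initialized with the known starting state.
    alpha = [[1.0 if s == s0 else 0.0 for s in range(n_states)]]
    for k in range(1, n + 1):
        a = [0.0] * n_states
        for s in range(n_states):
            for u in range(n_inputs):
                a[next_state[s][u]] += alpha[k - 1][s] * gamma[k - 1][out_sym[s][u]]
        alpha.append(_normalize(a))

    # Backward recursion, initialized with the known final state.
    beta = [None] * (n + 1)
    beta[n] = [1.0 if s == sn else 0.0 for s in range(n_states)]
    for k in range(n - 1, -1, -1):
        beta[k] = _normalize([
            sum(beta[k + 1][next_state[s][u]] * gamma[k][out_sym[s][u]]
                for u in range(n_inputs))
            for s in range(n_states)
        ])

    # Transition probabilities summed over starting states give the APPs.
    app = []
    for k in range(1, n + 1):
        p = [0.0] * n_inputs
        for s in range(n_states):
            for u in range(n_inputs):
                p[u] += (alpha[k - 1][s] * gamma[k - 1][out_sym[s][u]]
                         * beta[k][next_state[s][u]])
        app.append(_normalize(p))
    return app


# Toy two-state accumulator trellis (assumed): next state = s XOR u,
# branch symbol index = 2*u + next state, so M = 4 channel symbols.
NEXT_STATE = [[0, 1], [1, 0]]
OUT_SYM = [[0, 3], [1, 2]]


def toy_gammas(u_seq, good=0.97, bad=0.01):
    """Artificial branch metrics that strongly favor the transmitted path."""
    state, rows = 0, []
    for u in u_seq:
        row = [bad] * 4
        row[OUT_SYM[state][u]] = good
        rows.append(row)
        state = NEXT_STATE[state][u]
    return rows, state


u_true = [1, 0, 1, 1]
gamma, s_end = toy_gammas(u_true)
app = bcjr(gamma, NEXT_STATE, OUT_SYM, s0=0, sn=s_end)
decoded = [max(range(2), key=lambda u: p[u]) for p in app]
print(decoded)
```

Normalizing after every recursion step plays the role of the constants $h_\alpha$, $h_\beta$, and $h_\sigma$ and keeps the values in a numerically safe range.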

1.3.2. The Sliding Window BCJR (SW-BCJR)

The BCJR algorithm requires that the whole sequence has been received before starting the decoding process. In this aspect, it is similar to the Viterbi algorithm in its optimum version. To apply it in a PMCCC, we need to subdivide the information sequence into blocks, decode them by terminating the trellises of both CCs, and then decode the received sequence block by block. Beyond the rigidity, this solution also reduces the overall code rate.

A more flexible decoding strategy is offered by a modification of the BCJR algorithm in which the decoder operates on a fixed memory span, and decisions are forced with a given delay $D$. We call this new, and sub-optimal, algorithm the sliding window BCJR (SW-BCJR) algorithm. We will describe two versions of the sliding window BCJR algorithm that differ in the way they overcome the problem of initializing the backward recursion without having to wait for the entire sequence. We will describe the two algorithms using the previous step description suitably modified. Of the previous assumptions, we retain only that of the knowledge of the initial state $s_0$, and thus assume the transmission of semi-infinite code sequences, where the time span $K$ ranges from 1 to $\infty$.

1.3.2.1 The First Version of the Sliding Window BCJR Algorithm (SW1-BCJR)
Here are the steps:

(1) Initialize $\alpha_0$ according to Equation (41).

(2) Forward recursion at time k: Upon receiving $y_k$, the demodulator supplies to the decoder the M distinct branch metrics, and the decoder computes the probabilities $\alpha_k(S_i)$ according to Equations (38) and (40). The obtained values of $\alpha_k(S_i)$ are stored for all $S_i$, as well as the $\gamma_k(x)$.

(3) Initialization of the backward recursion (k > D):

$$\beta_k(S_j) = \alpha_k(S_j), \quad \forall S_j \tag{44}$$

(4) Backward recursion: It is performed according to Equation (42) from time k - 1 back to time k - D.

(5) The a posteriori transition probabilities at time k - D are computed according to

$$\sigma_{k-D}(S_i, u) = h_\sigma \, \alpha_{k-D-1}(S_i) \, \gamma_{k-D}(x(S_i, u)) \, \beta_{k-D}(S_i^+(u)) \tag{45}$$

(6) The APP at time k - D is computed as

$$P_{k-D}(u \mid y) = h_\sigma \sum_{S_i} \sigma_{k-D}(S_i, u) \tag{46}$$



For a convolutional code with parameters $(k_0, n_0)$, number of states N, and cardinality of the code alphabet $M = 2^{n_0}$, the SW1-BCJR algorithm requires storage of N × D values of $\alpha$'s and M × D values of the probabilities $\gamma_k(x)$ generated by the soft demodulator. Moreover, to update the $\alpha$'s and $\beta$'s for each time instant, the algorithm needs to perform $M \times 2^{k_0}$ multiplications and N additions of $2^{k_0}$ numbers. To output the set of APP at each time instant, we need a D-times long backward recursion. Thus, the computational complexity requires overall $(D+1) M \times 2^{k_0}$ multiplications and $(D+1) M$ additions of $2^{k_0}$ numbers each.

As a comparison, the Viterbi algorithm would require, in the same situation, $M \times 2^{k_0}$ additions and $M \times 2^{k_0}$-way comparisons, plus the trace-back operations, to get the decoded bits.

1.3.2.2 The Second, Simplified Version of the Sliding Window BCJR Algorithm (SW2-BCJR)
A simplification of the sliding window BCJR that significantly reduces the memory requirements comprises the following steps:

(1) Initialize $\alpha_0$ according to Equation (41).

(2) Forward recursion (k > D): If k > D, the probabilities $\alpha_{k-D-1}(S_i)$ are computed according to Equation (40).

(3) Initialization of the backward recursion (k > D):

$$\beta_k(S_j) = \frac{1}{N}, \quad \forall S_j \tag{47}$$

(4) Backward recursion (k > D): It is performed according to Equation (42) from time k - 1 back to time k - D.

(5) The a posteriori transition probabilities at time k - D are computed according to

$$\sigma_{k-D}(S_i, u) = h_\sigma \, \alpha_{k-D-1}(S_i) \, \gamma_{k-D}(x(S_i, u)) \, \beta_{k-D}(S_i^+(u)) \tag{48}$$

(6) The APP at time k - D is computed as

$$P_{k-D}(u \mid y) = h_\sigma \sum_{S_i} \sigma_{k-D}(S_i, u) \tag{49}$$

This version of the sliding window BCJR algorithm does not require storage of the N × D values of $\alpha$'s, as they are updated with a delay of D steps. As a consequence, only N values of $\alpha$'s and M × D values of the probabilities $\gamma_k(x)$ generated by the soft demodulator must be stored. The computational complexity is the same as the previous version of the algorithm. However, since the initialization of the $\beta$ recursion is less accurate, a larger value of D should be set in order to obtain the same accuracy on the output values $P_{k-D}(u \mid y)$. This observation will receive quantitative evidence in the section devoted to simulation results.
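
The storage figures quoted for the two sliding-window versions can be compared with a few lines of Python. This is only a bookkeeping sketch: the function name and the example values (16 states, M = 4, D = 100) are assumptions chosen for illustration.

```python
def sw_bcjr_storage(n_states, m_symbols, delay):
    """Number of stored values for SW1-BCJR and SW2-BCJR, as stated in the text."""
    sw1 = n_states * delay + m_symbols * delay  # N x D alphas plus M x D gammas
    sw2 = n_states + m_symbols * delay          # N alphas plus M x D gammas
    return sw1, sw2

# Hypothetical example: 16-state code, rate 1/2 (M = 2**2 = 4), delay D = 100.
sw1, sw2 = sw_bcjr_storage(n_states=16, m_symbols=4, delay=100)
print(sw1, sw2)
```

With these assumed numbers the second version stores 416 values instead of 2000, which is the saving the text attributes to updating the $\alpha$'s with a delay of D steps.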

1.3.3 Additive Algorithms 
1.3.3.1 The Log-BCJR 

The BCJR algorithm and its sliding window versions have been stated in multiplicative form. Owing to the 
monotonicity of the logarithm function, they can be converted into an additive form by conversion to logarithms. Let us define 
the following logarithmic quantities: 

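$$\Gamma_k(x) = \log[\gamma_k(x)]$$

$$A_k(S_i) = \log[\alpha_k(S_i)]$$

$$B_k(S_i) = \log[\beta_k(S_i)]$$

$$\Sigma_k(S_i, u) = \log[\sigma_k(S_i, u)]$$

These definitions lead to the following A and B recursions, derived from Equations (40), (42), and (37):

$$A_k(S_i) = \log\left[\sum_u \exp\{A_{k-1}(S_i^-(u)) + \Gamma_k(x(u, S_i))\}\right] + H_A \tag{50}$$

$$B_k(S_i) = \log\left[\sum_u \exp\{\Gamma_{k+1}(x(S_i, u)) + B_{k+1}(S_i^+(u))\}\right] + H_B \tag{51}$$

$$\Sigma_k(S_i, u) = A_{k-1}(S_i) + \Gamma_k(x(S_i, u)) + B_k(S_i^+(u)) + H_\Sigma \tag{52}$$

with the following initializations:

$$A_0(S_i) = \begin{cases} 0 & \text{if } S_i = S_0 \\ -\infty & \text{otherwise} \end{cases}$$

$$B_n(S_i) = \begin{cases} 0 & \text{if } S_i = S_n \\ -\infty & \text{otherwise} \end{cases}$$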



1.3.3.2 Simplified Versions of the Log-BCJR 

The problem in the recursions defined for the log-BCJR consists of the evaluation of the logarithm of a sum of exponentials:

$$\log\left[\sum_i \exp\{A_i\}\right]$$

An accurate estimate of this expression can be obtained by extracting the term with the highest exponential,

$$A_M = \max_i \{A_i\}$$

so that

$$\log\left[\sum_i e^{A_i}\right] = A_M + \log\left(1 + \sum_{A_i \neq A_M} e^{A_i - A_M}\right) \tag{53}$$

and by computing the second term of the right-hand side (RHS) of Equation (53) using lookup tables.
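
A small Python sketch of Equation (53) follows. It evaluates the correction term directly with `math.log1p` rather than with a lookup table, and the function name is an assumption made for illustration.

```python
import math

def log_sum_exp(values):
    """Log of a sum of exponentials, with the largest term pulled out first."""
    m = max(range(len(values)), key=values.__getitem__)  # index of the largest term
    a_max = values[m]
    rest = sum(math.exp(a - a_max) for i, a in enumerate(values) if i != m)
    return a_max + math.log1p(rest)

print(log_sum_exp([0.3, -1.2, 2.0]))
print(log_sum_exp([1000.0, 999.0]))  # no overflow, since only differences are exponentiated
```

Because every exponent is non-positive after the maximum is subtracted, the correction term stays between 0 and the logarithm of the number of terms, which is what makes a small table practical.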

However, when $A_M \gg A_i$, the second term can be neglected. This approximation leads to the additive logarithmic-BCJR (AL-BCJR) algorithm:

$$A_k(S_i) = \max_u \left[ A_{k-1}(S_i^-(u)) + \Gamma_k(x(u, S_i)) \right] + H_A \tag{54}$$

$$B_k(S_i) = \max_u \left[ B_{k+1}(S_i^+(u)) + \Gamma_{k+1}(x(S_i, u)) \right] + H_B \tag{55}$$

$$\Sigma_k(S_i, u) = A_{k-1}(S_i) + \Gamma_k(x(S_i, u)) + B_k(S_i^+(u)) + H_\Sigma \tag{56}$$

with the same initialization of the log-BCJR.

Both versions of the SW-BCJR algorithm described can be used, with obvious modifications, to transform the block 
log-BCJR and the AL-BCJR into their sliding window versions, leading to the SW-log-BCJR and the SWAL1-BCJR and 
SWAL2-BCJR algorithms. 
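
The AL-BCJR forward recursion of Equation (54) can be sketched as below. The trellis representation (a dictionary from `(state, input)` to next state), the function name, and the choice of the normalization constant so that the largest entry is zero are all assumptions made for this illustration.

```python
NEG_INF = float("-inf")

def al_forward(n_states, transitions, log_gammas):
    """Max-log forward recursion.

    transitions: {(state, u): next_state}
    log_gammas:  list over time of {(state, u): Gamma_k for that branch}
    Returns the list of A_k vectors, starting with A_0.
    """
    a = [0.0] + [NEG_INF] * (n_states - 1)  # A_0: the trellis starts in state 0
    history = [a]
    for gam in log_gammas:
        nxt = [NEG_INF] * n_states
        for (s, u), t in transitions.items():
            # Keep only the best branch entering each state (max replaces log-sum-exp).
            nxt[t] = max(nxt[t], a[s] + gam[(s, u)])
        top = max(nxt)
        a = [v - top for v in nxt]  # additive normalization, largest entry becomes 0
        history.append(a)
    return history

# Hypothetical 2-state trellis with next state = s XOR u.
trellis = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
metrics = [{(0, 0): -0.1, (0, 1): -2.0, (1, 0): -1.0, (1, 1): -3.0}] * 2
print(al_forward(2, trellis, metrics))
```

Only additions, comparisons, and selections appear in the loop, which is the structural similarity to a Viterbi decoder that motivates the approximation.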

1.3.4. Explicit Algorithms for Some Particular Cases 

In this section, we will make explicit the quantities considered in the previous algorithms' descriptions by making 
assumptions on the code type, modulation format, and channel. 
1.3.4.1. Rate 1/n Binary Systematic Convolutional Encoder

In this section, we particularize the previous equations in the case of a rate 1/n binary systematic encoder associated 
to n binary-pulse amplitude modulation (PAM) signals or binary phase shift keying (PSK) signals. 

The channel symbols x and the output symbols c from the encoder can be represented as vectors of n binary components:

$$c = [c_1, \ldots, c_n], \quad c_i \in \{0, 1\}$$

$$x = [x_1, \ldots, x_n], \quad x_i \in \{A, -A\}$$

$$x_k = [x_{k1}, \ldots, x_{kn}]$$

$$y_k = [y_{k1}, \ldots, y_{kn}]$$

where the notations have been modified to show the vector nature of the symbols. The joint probabilities $\gamma_k(x)$, over a memoryless channel, can be split as

$$\gamma_k(x) = \prod_{m=1}^{n} P(y_{km} \mid x_{km} = x_m) \, P(x_{km} = x_m) \tag{57}$$

Since in this case the encoded symbols are n-tuples of binary symbols, it is useful to redefine the input probabilities, $\gamma$, in terms of the likelihood ratios:

$$\lambda_{km} = \frac{P(y_{km} \mid x_{km} = A)}{P(y_{km} \mid x_{km} = -A)}, \qquad \lambda^A_{km} = \frac{P(x_{km} = A)}{P(x_{km} = -A)}$$

so that, from Equation (57),

$$\gamma_k(x) = h_\gamma \prod_{m=1}^{n} \left[ \lambda_{km} \, \lambda^A_{km} \right]^{c_m}$$

where $h_\gamma$ takes into account all terms independent of x.
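The BCJR can be restated as follows:

$$\alpha_k(S_i) = h_\gamma h_\alpha \sum_u \alpha_{k-1}(S_i^-(u)) \prod_{m=1}^{n} \left[ \lambda_{km} \, \lambda^A_{km} \right]^{c_m(u, S_i)} \tag{58}$$

$$\beta_k(S_i) = h_\gamma h_\beta \sum_u \beta_{k+1}(S_i^+(u)) \prod_{m=1}^{n} \left[ \lambda_{(k+1)m} \, \lambda^A_{(k+1)m} \right]^{c_m(S_i, u)} \tag{59}$$

$$\sigma_k(S_i, u) = h_\gamma h_\sigma \, \alpha_{k-1}(S_i) \prod_{m=1}^{n} \left[ \lambda_{km} \, \lambda^A_{km} \right]^{c_m(S_i, u)} \beta_k(S_i^+(u)) \tag{60}$$

whereas its simplification, the AL-BCJR algorithm, becomes

$$A_k(S_i) = \max_u \left\{ A_{k-1}(S_i^-(u)) + \sum_{m=1}^{n} c_m(u, S_i) \left( \Lambda_{km} + \Lambda^A_{km} \right) \right\} + H_A \tag{61}$$

$$B_k(S_i) = \max_u \left\{ B_{k+1}(S_i^+(u)) + \sum_{m=1}^{n} c_m(S_i, u) \left( \Lambda_{(k+1)m} + \Lambda^A_{(k+1)m} \right) \right\} + H_B \tag{62}$$

$$\Sigma_k(S_i, u) = A_{k-1}(S_i) + \sum_{m=1}^{n} c_m(S_i, u) \left( \Lambda_{km} + \Lambda^A_{km} \right) + B_k(S_i^+(u)) \tag{63}$$

where $\Lambda$ stands for the logarithm of the corresponding quantity $\lambda$.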
1.3.4.2 The Additive White Gaussian Noise Channel 

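When the channel is an additive white Gaussian noise (AWGN) channel, we obtain the explicit expression of the log-likelihood ratios $\Lambda_{ki}$ as

$$\Lambda_{ki} = \log \frac{P(y_{ki} \mid x_{ki} = A)}{P(y_{ki} \mid x_{ki} = -A)} = \log \frac{\exp\left\{ -\frac{1}{2\sigma^2} (y_{ki} - A)^2 \right\}}{\exp\left\{ -\frac{1}{2\sigma^2} (y_{ki} + A)^2 \right\}} = \frac{2A}{\sigma^2} \, y_{ki}$$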
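
The closed form of the AWGN log-likelihood ratio can be checked numerically against the ratio of the two Gaussian densities. The sketch below assumes a noise standard deviation `sigma` and uses hypothetical function names.

```python
import math

def gaussian_pdf(y, mean, sigma):
    """Density of a Gaussian with the given mean and standard deviation."""
    return math.exp(-(y - mean) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def llr_from_densities(y, amplitude, sigma):
    """Log of P(y | x = +A) / P(y | x = -A), computed from the densities."""
    return math.log(gaussian_pdf(y, amplitude, sigma) / gaussian_pdf(y, -amplitude, sigma))

def llr_closed_form(y, amplitude, sigma):
    """The same quantity after the squares cancel: 2 * A * y / sigma**2."""
    return 2 * amplitude * y / sigma ** 2

print(llr_from_densities(0.37, 1.0, 0.8), llr_closed_form(0.37, 1.0, 0.8))
```

The two values agree to rounding error, so a decoder only needs to scale each received sample by a constant instead of evaluating any exponential.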



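Hence, the AL-BCJR algorithm assumes the following form:

$$A_k(S_i) = \max_u \left\{ A_{k-1}(S_i^-(u)) + \sum_{m=1}^{n} c_m(u, S_i) \left( \frac{2A}{\sigma^2} y_{km} + \Lambda^A_{km} \right) \right\} + H_A \tag{64}$$

$$B_k(S_i) = \max_u \left\{ B_{k+1}(S_i^+(u)) + \sum_{m=1}^{n} c_m(S_i, u) \left( \frac{2A}{\sigma^2} y_{(k+1)m} + \Lambda^A_{(k+1)m} \right) \right\} + H_B \tag{65}$$

$$\Sigma_k(S_i, u) = A_{k-1}(S_i) + \sum_{m=1}^{n} c_m(S_i, u) \left( \frac{2A}{\sigma^2} y_{km} + \Lambda^A_{km} \right) + B_k(S_i^+(u)) \tag{66}$$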



We will consider turbo codes with rate 1/2 component convolutional codes transmitted as binary PAM or binary PSK 
over an AWGN channel. 

1.3.5. Iterative Decoding of Parallel Concatenated Convolutional Codes

In this section, we will show how the MAP algorithms previously described can be embedded into the iterative decoding procedure of parallel concatenated codes. We will derive the iterative decoding algorithm through suitable approximations performed on maximum-likelihood decoding. The description will be based on the fairly general parallel concatenated code shown in Figure 28, which employs three encoders and three interleavers (denoted by $\pi$ in the figure).

Let $u_k$ be the binary random variable taking values in {0,1}, representing the sequence of information bits $u = (u_1, \ldots, u_N)$. The optimum decision algorithm on the kth bit $u_k$ is based on the conditional log-likelihood ratio $L_k$:

$$L_k = \log \frac{P(u_k = 1 \mid y)}{P(u_k = 0 \mid y)} = \log \frac{\sum_{u: u_k = 1} P(y \mid u) \prod_{j \neq k} P(u_j)}{\sum_{u: u_k = 0} P(y \mid u) \prod_{j \neq k} P(u_j)} + \log \frac{P(u_k = 1)}{P(u_k = 0)}$$

$$\phantom{L_k} = \log \frac{\sum_{u: u_k = 1} P(y \mid x(u)) \prod_{j \neq k} P(u_j)}{\sum_{u: u_k = 0} P(y \mid x(u)) \prod_{j \neq k} P(u_j)} + \log \frac{P(u_k = 1)}{P(u_k = 0)} \tag{67}$$

where, in Equation (67), $P(u_j)$ are the a priori probabilities.

If the rate $b/n_0$ constituent code is not equivalent to a punctured rate $1/n'_0$ code (in this case the information sent is the information data and one parity bit, and the parity bit sent is different every time) or if turbo trellis-coded modulation is used, we can first use the symbol MAP algorithm as described in the previous sections to compute the log-likelihood ratio of a symbol $u = u_1, \ldots, u_b$, given the observation y, as

$$\lambda(u) = \log \frac{P(u \mid y)}{P(0 \mid y)}$$

where 0 corresponds to the all-zero symbol. Then we obtain the log-likelihood ratios of the jth bit within the symbol by

$$L(u_j) = \log \frac{\sum_{u: u_j = 1} e^{\lambda(u)}}{\sum_{u: u_j = 0} e^{\lambda(u)}}$$

In this way, the turbo decoder operates on bits, and bit, rather than symbol, interleaving is used.
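
The conversion from symbol log-likelihood ratios to bit log-likelihood ratios can be sketched as follows. The dictionary representation of the symbols and the function name are assumptions made for illustration.

```python
import math

def bit_llrs(symbol_llr, n_bits):
    """Bit LLRs from symbol LLRs.

    symbol_llr maps each symbol (a tuple of bits) to lambda(u) = log P(u|y) / P(0|y).
    """
    out = []
    for j in range(n_bits):
        num = sum(math.exp(lam) for u, lam in symbol_llr.items() if u[j] == 1)
        den = sum(math.exp(lam) for u, lam in symbol_llr.items() if u[j] == 0)
        out.append(math.log(num / den))
    return out

# Hypothetical posterior over 2-bit symbols, converted to symbol LLRs first.
post = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
lam = {u: math.log(p / post[(0, 0)]) for u, p in post.items()}
print(bit_llrs(lam, 2))
```

The common factor $P(0 \mid y)$ cancels between numerator and denominator, so the result equals the log ratio of the marginal bit probabilities.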

To explain the basic decoding concept, we restrict ourselves to three codes, but extension to several codes is straightforward. In order to simplify the notation, consider the combination of the interleaver and the constituent encoder connected to it as a block code with input u and outputs $x_i$, i = 0,1,2,3 ($x_0 = u$) and the corresponding received sequences as $y_i$, i = 0,1,2,3. The optimum bit decision metric on each bit is (for data with uniform a priori probabilities)

$$L_k = \log \frac{\sum_{u: u_k = 1} P(y_0 \mid u) P(y_1 \mid u) P(y_2 \mid u) P(y_3 \mid u)}{\sum_{u: u_k = 0} P(y_0 \mid u) P(y_1 \mid u) P(y_2 \mid u) P(y_3 \mid u)} \tag{68}$$

but, in practice, we cannot compute Equation (68) for large N because the permutations $\pi_2$, $\pi_3$ imply that $y_2$ and $y_3$ are no longer simple convolutional encodings of u. Suppose that we evaluate $P(y_i \mid u)$, i = 0,2,3 in Equation (68) using Bayes' rule and using the following approximation:

$$P(u \mid y_i) \approx \prod_{k=1}^{N} \tilde{P}_i(u_k) \tag{69}$$

Note that $P(u \mid y_i)$ is not separable in general. However, for i = 0, $P(u \mid y_0)$ is separable; hence, Equation (69) holds with equality. So we need an algorithm that approximates a nonseparable distribution $P(u \mid y_i) = P$ with a separable distribution $\prod_{k=1}^{N} \tilde{P}_i(u_k) = Q$. The best approximation can be obtained using the Kullback cross-entropy minimizer, which minimizes the cross-entropy $H(Q, P) = E\{\log(Q/P)\}$ between the input P and the output Q.

The MAP algorithm approximates a nonseparable distribution with a separable one; however, it is not clear how good it is compared with the Kullback cross-entropy minimizer. Here we use the MAP algorithm for such an approximation. In the iterative decoding, as the reliability of the $\{u_k\}$ improves, intuitively one expects that the cross-entropy between the input and the output of the MAP algorithm will decrease, so that the approximation will improve. If such an approximation, i.e., Equation (69), can be obtained, we can use it in Equation (68) for i = 2 and i = 3 (by Bayes' rule) to complete the algorithm.
Define $\tilde{L}_{ik}$ by

$$\tilde{P}_i(u_k) = \frac{e^{u_k \tilde{L}_{ik}}}{1 + e^{\tilde{L}_{ik}}} \tag{70}$$

where $u_k \in \{0, 1\}$. To obtain $\{\tilde{P}_i\}$ or, equivalently, $\{\tilde{L}_{ik}\}$, we use Equations (69) and (70) for i = 0,2,3 (by Bayes' rule) to express Equation (68) as

$$L_k = f(y_1, \tilde{L}_0, \tilde{L}_2, \tilde{L}_3, k) + \tilde{L}_{0k} + \tilde{L}_{2k} + \tilde{L}_{3k} \tag{71}$$

where $\tilde{L}_{0k} = 2A y_{0k} / \sigma^2$ (for binary modulation) and

$$f(y_1, \tilde{L}_0, \tilde{L}_2, \tilde{L}_3, k) = \log \frac{\sum_{u: u_k = 1} P(y_1 \mid u) \prod_{j \neq k} e^{u_j (\tilde{L}_{0j} + \tilde{L}_{2j} + \tilde{L}_{3j})}}{\sum_{u: u_k = 0} P(y_1 \mid u) \prod_{j \neq k} e^{u_j (\tilde{L}_{0j} + \tilde{L}_{2j} + \tilde{L}_{3j})}} \tag{72}$$

We can use Equations (69) and (70) again, but this time for i = 0,1,3, to express Equation (68) as

$$L_k = f(y_2, \tilde{L}_0, \tilde{L}_1, \tilde{L}_3, k) + \tilde{L}_{0k} + \tilde{L}_{1k} + \tilde{L}_{3k} \tag{73}$$

and similarly,

$$L_k = f(y_3, \tilde{L}_0, \tilde{L}_1, \tilde{L}_2, k) + \tilde{L}_{0k} + \tilde{L}_{1k} + \tilde{L}_{2k} \tag{74}$$

A solution to Equations (71), (73), and (74) is

$$\tilde{L}_{1k} = f(y_1, \tilde{L}_0, \tilde{L}_2, \tilde{L}_3, k)$$

$$\tilde{L}_{2k} = f(y_2, \tilde{L}_0, \tilde{L}_1, \tilde{L}_3, k) \tag{75}$$

$$\tilde{L}_{3k} = f(y_3, \tilde{L}_0, \tilde{L}_1, \tilde{L}_2, k)$$

for k = 1,2,...,N, provided that a solution to Equation (75) does indeed exist. The final decision is then based on

$$L_k = \tilde{L}_{0k} + \tilde{L}_{1k} + \tilde{L}_{2k} + \tilde{L}_{3k} \tag{76}$$

which is passed through a hard limiter with zero threshold. We attempted to solve the nonlinear equations in Equation (75) for $\tilde{L}_1$, $\tilde{L}_2$, $\tilde{L}_3$ by using the iterative procedure:

$$\tilde{L}_{1k}^{(m+1)} = \alpha_1^{(m)} f(y_1, \tilde{L}_0, \tilde{L}_2^{(m)}, \tilde{L}_3^{(m)}, k) \tag{77}$$

for k = 1,2,...,N, iterating on m. Similar recursions hold for $\tilde{L}_{2k}^{(m)}$ and $\tilde{L}_{3k}^{(m)}$.

We start the recursion with the initial condition $\tilde{L}_1^{(0)} = \tilde{L}_2^{(0)} = \tilde{L}_3^{(0)} = \tilde{L}_0$. For the computation of $f(\cdot)$, we can use any MAP algorithm as described in the previous sections, with interleavers (direct and inverse) where needed; call this the basic decoder $D_i$, i = 1,2,3. The $\tilde{L}_{ik}^{(m)}$, i = 1,2,3 represent the extrinsic information. The signal flow graph for extrinsic information is shown in Figure 29, which is a fully connected graph without self-loops. Parallel, serial, or hybrid implementations can be realized based on the signal flow graph (in this figure $y_0$ is considered as part of $y_1$). Based on our equations, each node's output is equal to the internally generated reliability L minus the sum of all inputs to that node. The BCJR MAP algorithm always starts and ends at the all-zero state since we always terminate the trellis. We assumed $\pi_1 = I$ (identity); however, any $\pi_1$ can be used.

The overall decoder is composed of block decoders $D_i$ connected in parallel, as in Figure 30 (when the switches are in position P), which can be implemented as a pipeline or by feedback. A serial implementation is also shown in Figure 30 (when the switches are in position S). For those applications where the systematic bits are not transmitted or for parallel concatenated trellis codes with high-level modulation, we should set $\tilde{L}_0 = 0$. Even in the presence of systematic bits, if desired, one can set $\tilde{L}_0 = 0$ and consider $y_0$ as part of $y_1$. If the systematic bits are distributed among encoders, we use the same distribution for $y_0$ among the received observations for MAP decoders.

At this point, further approximation for iterative decoding is possible if one term corresponding to a sequence u dominates other terms in the summation in the numerator and denominator of Equation (72). Then the summations in Equation (72) can be replaced by "maximum" operations with the same indices, i.e., replacing $\sum_{u: u_k = i}$ with $\max_{u: u_k = i}$ for i = 0,1. A similar approximation can be used for $\tilde{L}_{2k}$ and $\tilde{L}_{3k}$ in Equation (75). This sub-optimal decoder then corresponds to an iterative decoder that uses AL-BCJR rather than BCJR decoders. As discussed, such approximations have been used by replacing $\sum$ with "max" in the log-BCJR algorithm to obtain AL-BCJR. Clearly, all versions of SW-BCJR can replace BCJR (MAP) decoders in Figure 30.

For turbo codes with only two constituent codes, Equation (77) reduces to

$$\tilde{L}_{1k}^{(m+1)} = \alpha_1^{(m)} f(y_1, \tilde{L}_0, \tilde{L}_2^{(m)}, k)$$

$$\tilde{L}_{2k}^{(m+1)} = \alpha_2^{(m)} f(y_2, \tilde{L}_0, \tilde{L}_1^{(m)}, k)$$

for k = 1,2,...,N and m = 1,2,..., where, for each iteration, $\alpha_1^{(m)}$ and $\alpha_2^{(m)}$ can be optimized (simulated annealing) or set to 1 for simplicity. The decoding configuration for two codes is shown in Figure 31. In this special case, since the paths in Figure 31 are disjoint, the decoder structure can be reduced to a serial mode structure if desired. If we optimize $\alpha_1^{(m)}$ and $\alpha_2^{(m)}$, this requires estimates of the variances of $\tilde{L}_{1k}$ and $\tilde{L}_{2k}$ for each iteration in the presence of errors.

In the results presented in the next section, we will use a PMCCC with only two constituent codes. 
1.3.6. Simulation Results 

In this section, we will present some simulation results obtained applying the iterative decoding algorithm, which, in 
turn, uses the optimum BCJR and the sub-optimal, but simpler, SWAL2-BCJR as embedded MAP algorithms. All simulations 
refer to a rate 1/3 PMCCC with two equal, recursive convolutional constituent codes with 16 states and generator matrix

$$G(D) = \left[ 1, \ \frac{1 + D + D^3 + D^4}{1 + D^3 + D^4} \right]$$

and an interleaver of length 16,384, using an S-random permutation with S = 40. Each simulation run examined at least 25,000,000 bits. In Figure 32, we plot the bit-error probabilities as a function of the number of iterations of the decoding procedure using the optimum block BCJR algorithm for various values of the signal-to-noise ratio. It can be seen that the decoding algorithm converges down to bit error rate (BER) = $10^{-5}$ at signal-to-noise ratios of 0.2 dB with nine iterations. The
same curves are plotted in Figure 33 for the case of the sub-optimum SWAL2-BCJR algorithm. In this case, 0.75 dB of 
signal-to-noise ratio is required for convergence to the same BER and with the same number of iterations. 

In Figure 34, the bit-error probability versus the signal-to-noise ratio is plotted for a fixed number (5) of iterations of 
the decoding algorithm and for both optimum BCJR and SWAL2-BCJR MAP decoding algorithms. It can be seen that the 
penalty incurred by the sub-optimum algorithm amounts to about 0.5 dB. 

Algorithms were of the block type. The penalty is completely attributable to the approximation of the sum of 
exponentials. To verify this, we have used a SW2-BCJR and compared its results with the optimum block BCJR, obtaining the 
same results. 

Finally, in Figures 35 and 36, we plot the number of iterations needed to obtain a given bit-error probability versus 
the bit signal-to-noise ratio, for the two algorithms. These curves provide information on the delay incurred to obtain a given 
reliability as a function of the bit signal-to-noise ratio. 
1.3.7 Circuits to Implement the MAP Algorithm for Decoding Rate 1/n Component Codes of a PMCCC

We show the basic circuits required for the implementation of a serial additive MAP algorithm for both block log-BCJR and SW-log-BCJR. Extension to a parallel implementation is straightforward. Figure 37 shows the implementation of Equation (50) for the forward recursion using a lookup table for evaluation of $\log(1 + e^{-x})$, and subtraction of $\max_j \{A_k(S_j)\}$ from $A_k(S_i)$ is used for normalization to prevent buffer overflow. The circuit for maximization can be implemented simply by using a comparator and selector with feedback operation. Figure 38 shows the implementation of Equation (51) for the backward recursion, which is similar to Figure 37. A circuit for computation of $\log(P_k(u \mid y))$ from Equation (36) using Equation (52) for final computation of bit reliability is shown in Figure 39. In Figure 39, switch 1 is in position 1 and switch 2 is open at the start of operation. The circuit accepts $\Sigma_k(S_i, u)$ for i = 1, then switch 1 moves to position 2 for feedback operation. The circuit performs the operations for i = 1,2,...,N. When the circuit accepts $\Sigma_k(S_i, u)$ for i = N, switch 1 goes to position 1 and switch 2 is closed. This operation is done for u = 1 and u = 0. The difference between $\log(P_k(1 \mid y))$ and $\log(P_k(0 \mid y))$ represents the reliability value required for turbo decoding, i.e., the value of $L_k$ in Equation (67).

We propose two simplifications to be used for computation of $\log(1 + e^{-x})$ without using a lookup table.

Approximation 1: We used the approximation $\log(1 + e^{-x}) \approx -ax + b$, $0 < x < b/a$, where b = log(2), and we selected a = 0.3 for the simulation. We observed about 0.1 dB degradation compared with the full MAP algorithm for the code. The parameter a should be optimized, and it may not necessarily be the same for the computation of Equation (50), Equation (51), and $\log(P_k(u \mid y))$ from Equation (36) using Equation (52). We call this "linear" approximation.

Approximation 2: We take

$$\log(1 + e^{-x}) \approx \begin{cases} 0 & \text{if } x > \eta \\ c & \text{if } x < \eta \end{cases}$$

We selected c = log(2) and the threshold $\eta$ = 1.0 for our simulation. We observed about 0.2 dB degradation compared with the full MAP algorithm for the code. This threshold should be optimized for a given SNR, and it may not necessarily be the same for the computation of Equation (50), Equation (51), and $\log(P_k(u \mid y))$ from Equation (36) using Equation (52). If we use this approximation, the log-BCJR algorithm can be built based on addition, comparison, and selection operations without requiring a lookup table, which is similar to a Viterbi algorithm implementation. We call this "threshold" approximation. At most, 8 to 10 bit representation suffices for all operations.
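
The two approximations can be compared against the exact correction term with the sketch below. The parameter values a = 0.3, b = c = log(2), and threshold 1.0 are the ones quoted above; returning 0 outside the stated range of the linear approximation is an assumption, as are the function names.

```python
import math

LOG2 = math.log(2.0)

def exact(x):
    """Exact correction term log(1 + exp(-x)) for x >= 0."""
    return math.log1p(math.exp(-x))

def linear_approx(x, a=0.3, b=LOG2):
    """'Linear' approximation: -a*x + b for 0 < x < b/a, assumed 0 beyond that."""
    return max(0.0, b - a * x)

def threshold_approx(x, c=LOG2, eta=1.0):
    """'Threshold' approximation: c below the threshold eta, 0 above it."""
    return c if x < eta else 0.0

for x in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(x, exact(x), linear_approx(x), threshold_approx(x))
```

Both approximations need only a comparison and at most one multiply-add, which is why they can replace the lookup table in the circuits of Figures 37 and 38.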
1.3.8 Trellis Termination 

If needed, the encoder in Figure 28 may generate an (n(N + M), N) block code, where the M tail bits of encoder 2 and encoder 3 are not transmitted. Since the component encoders are recursive, it is not sufficient to set the last M information bits to zero in order to drive all the encoders to the all-zero state, i.e., to terminate the trellis. The termination (tail) sequence depends on the state of each component encoder after N bits, which makes it impossible to terminate the component encoders with just M bits. This issue has not been resolved in previously proposed turbo code implementations. Fortunately, the simple stratagem illustrated in Figure 33 is sufficient to terminate the trellis at the end of the block. (The code shown is not important.) Here the switch is in position "A" for the first N clock cycles and is in position "B" for M additional cycles, which will flush the encoders with zeros. The decoder does not assume knowledge of the M tail bits. The same termination method may be used for all encoders.
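
The effect of the switch can be illustrated with a small shift-register model. This is a sketch under stated assumptions: the feedback taps (all ones, memory M = 4) and the function names are hypothetical, and position "B" is modeled as feeding the register's own feedback bit back as the input, so that the bit shifted in is always zero.

```python
def feedback(register, taps):
    """XOR of the tapped register cells (the recursive part of the encoder)."""
    fb = 0
    for tap, bit in zip(taps, register):
        fb ^= tap & bit
    return fb

def step(register, u, taps):
    """Shift one bit into the recursive register and return the new register."""
    return [u ^ feedback(register, taps)] + register[:-1]

def final_state(bits, taps, terminate):
    """Run N data bits (position A), then M extra cycles (position B or plain zeros)."""
    reg = [0] * len(taps)
    for u in bits:
        reg = step(reg, u, taps)
    for _ in range(len(taps)):
        u = feedback(reg, taps) if terminate else 0
        reg = step(reg, u, taps)
    return reg

data = [1, 0, 1, 1, 0, 0, 1]
taps = [1, 1, 1, 1]  # assumed feedback taps, memory M = 4
print(final_state(data, taps, terminate=False))  # zero tail: register not cleared
print(final_state(data, taps, terminate=True))   # feedback tail: all-zero state
```

For this input, appending four zeros leaves the register in a nonzero state, while the feedback-driven tail clears it in exactly M cycles regardless of the data.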

1.3.9. Weight Distribution

In order to estimate the performance of a code, it is necessary to have information about its minimum distance, weight distribution, or actual code geometry, depending on the accuracy required for the bounds or approximations. The challenge is in finding the pairing of codewords from each individual encoder, induced by a particular set of interleavers. Intuitively, we would like to avoid joining low-weight codewords from one encoder with low-weight words from the other encoders. In the example of Figure 28, the component codes have minimum distances 5, 2, and 2. This will produce a worst-case minimum distance of 9 for the overall code. Note that this would be unavoidable if the encoders were not recursive since, in this case, the minimum weight word for all three encoders is generated by the input sequence u = (00...0000100...000) with a single "1", which will appear again in the other encoders, for any choice of interleavers. This motivates the use of recursive encoders, where the key ingredient is the recursiveness and not the fact that the encoders are systematic. For our example, the input sequence u = (00...00100100...000) generates a low-weight codeword with weight 6 for the first encoder. If the interleavers do not "break" this input pattern, the resulting codeword's weight will be 14. In general, weight-2 sequences with 2 + 3t zeros separating the 1's would result in a total weight of 14 + 6t if there were no permutations. With permutations before the second and third encoders, a weight-2 sequence with its 1's separated by $2 + 3t_1$ zeros will be permuted into two other weight-2 sequences with 1's separated by $2 + 3t_i$ zeros, i = 2,3, where each $t_i$ is defined as a multiple of 1/3. If any $t_i$ is not an integer, the corresponding encoded output will have a high weight because then the convolutional code output is non-terminating (until the end of the block). If all $t_i$'s are integers, the total encoded weight will be $14 + 2\sum_{i=1}^{3} t_i$. Thus, one of the considerations in designing the interleaver is to avoid integer triplets $(t_1, t_2, t_3)$ that are simultaneously small in all three components. In fact, it would be nice to design an interleaver to guarantee that the smallest value of $\sum_{i=1}^{3} t_i$ (for integer $t_i$) grows with the block size N.

For comparison, we consider the same encoder structure in Figure 28, except with the roles of $g_a$ and $g_b$ reversed. Now the minimum distances of the three component codes are 5, 3, and 3, producing an overall minimum distance of 11 for the total code without any permutations. This is apparently a better code, but it turns out to be inferior as a turbo code. This paradox is explained by again considering the critical weight-2 data sequences. For this code, weight-2 sequences with $1 + 2t_1$ zeros separating the two 1's produce self-terminating output and hence low-weight encoded words. In the turbo encoder, such sequences will be permuted to have separations $1 + 2t_i$, i = 2,3, for the second and third encoders, where now each $t_i$ is defined as a multiple of 1/2. But now the total encoded weight for integer triplets $(t_1, t_2, t_3)$ is $11 + \sum_{i=1}^{3} t_i$. Notice how this weight grows only half as fast with $\sum_{i=1}^{3} t_i$ as the previously calculated weight for the original code. If $\sum_{i=1}^{3} t_i$ can be made to grow with block size by proper choice of interleaver, then clearly it is important to choose component codes that cause the overall weight to grow as fast as possible with the individual separations $t_i$. This consideration outweighs the criterion of selecting component codes that would produce the highest minimum distance if unpermuted.

There are also many weight-n, n = 3, 4, 5, ..., data sequences that produce self-terminating output and hence low encoded weight. However, these sequences are much more likely to be broken up by the random interleavers than the weight-2 sequences and are therefore likely to produce non-terminating output from at least one of the encoders. Thus, turbo code structures, which would have low minimum distances if unpermuted, can still perform well if the low-weight codewords of the component codes are produced by input sequences with weight higher than two.
1.3.10. Weight Distribution with Random Interleavers

Now we briefly examine the issue of whether one or more random interleavers can avoid matching small separations between the 1's of a weight-2 data sequence with equally small separations between the 1's of its permuted version(s). Consider, for example, a particular weight-2 data sequence (...001001000...) which corresponds to a low-weight codeword in each of the encoders of Figure 28. If we randomly select an interleaver of size N, the probability that this sequence will be permuted into another sequence of the same form is roughly 2/N (assuming that N is large, and ignoring minor edge effects). The probability that such an unfortunate pairing happens for at least one possible position of the original sequence (...001001000...) within the block size of N is approximately $1 - (1 - 2/N)^N \approx 1 - e^{-2}$. This implies that the minimum distance of a two-code turbo code constructed with a random permutation is not likely to be much higher than the encoded weight of such an unpermuted weight-2 data sequence, e.g., 14 for the code in Figure 28. (For the worst-case permutations, the $d_{min}$ of the code is still 9, but these permutations are highly unlikely if chosen randomly.) By contrast, if we use three codes and two different interleavers, the probability that a particular sequence (...001001000...) will be reproduced by both interleavers is only $(2/N)^2$. Now the probability of finding such an unfortunate data sequence somewhere within the block of size N is roughly $1 - [1 - (2/N)^2]^N \approx 4/N$. Thus it is probable that a three-code turbo code using two random interleavers will see an increase in its minimum distance beyond the encoded weight of an unpermuted weight-2 data sequence. This argument can be extended to account for other weight-2 data sequences which may also produce low-weight codewords, e.g., $(\ldots 00100(000)^t 1000 \ldots)$, for the code in Figure 28. For comparison, let us consider a weight-3 data sequence such as (...0011100...) which for our example corresponds to the minimum distance of the code (using no permutations). The probability that this sequence is reproduced with one random interleaver is roughly $6/N^2$, and the probability that some sequence of the form (...0011100...) is paired with another of the same form is $1 - (1 - 6/N^2)^N \approx 6/N$. Thus for large block sizes, the bad weight-3 data sequences have a small probability of being matched with bad weight-3 permuted data sequences, even in a two-code system. For a turbo code using q codes and q-1 random interleavers this probability is even smaller, $1 - [1 - (6/N^2)^{q-1}]^N \approx (6/N)(6/N^2)^{q-2}$. This implies that the minimum distance codeword of the turbo code in Figure 28 is more likely to result from a weight-2 data sequence of the form (...001001000...) than from the weight-3 sequence (...0011100...) that produces the minimum distance in the unpermuted version of the same code. Higher weight sequences have even smaller probability of reproducing themselves after being passed through a random interleaver. For a turbo code using q codes and q-1 interleavers, the probability that a weight-n data sequence will be reproduced somewhere within the block by all q-1 permutations is of the form $1 - [1 - (\beta / N^{n-1})^{q-1}]^N$, where $\beta$ is a number that depends on the weight-n data sequence but does not increase with block size N. For large N, this probability is proportional to $(1/N)^{nq - n - q}$, which falls off rapidly with N when n and q are greater than two. Furthermore, the symmetry of this expression indicates that increasing either the weight of the data sequence n or the number of codes q has roughly the same effect on lowering this probability. In summary, from the above arguments we conclude that weight-2 data sequences are an important factor in the design of the component codes, and that higher weight sequences have decreasing importance. Also, increasing the number of coders may result in better turbo codes.
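
The probabilities quoted for weight-2 sequences can be evaluated numerically. The sketch below uses a hypothetical function name and an assumed block size N = 4096; it computes the chance that at least one position in the block is reproduced by every interleaver.

```python
import math

def p_bad_pairing(block_size, n_interleavers, per_position):
    """1 - (1 - p**(q-1))**N, where p is the single-interleaver probability."""
    return 1.0 - (1.0 - per_position ** n_interleavers) ** block_size

N = 4096  # assumed block size
two_codes = p_bad_pairing(N, 1, 2.0 / N)    # one random interleaver
three_codes = p_bad_pairing(N, 2, 2.0 / N)  # two random interleavers
print(two_codes, 1.0 - math.exp(-2.0))
print(three_codes, 4.0 / N)
```

With one interleaver the bad pairing is more likely than not, whereas with two interleavers it becomes a rare event at this block size, which is the argument made above for using more than two codes.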

The minimum distance is not the most important quantity of the turbo code, except for its asymptotic performance, at very high $E_b/N_0$. At moderate SNRs, the weight distribution for the first several possible weights is necessary to compute the code performance. Estimating the complete weight distribution of these codes for large N and fixed interleavers is still an open problem. However, it is possible to estimate the weight distribution for large N for random interleavers by using probabilistic arguments.

1.3.11. Interleaver design 

Interleavers should be capable of spreading low-weight input sequences so that the resulting codeword has high weight. Block interleavers, defined by a matrix with $\nu_r$ rows and $\nu_c$ columns such that $N = \nu_r \times \nu_c$, may fail to spread certain sequences. For example, the weight-4 sequence shown in Figure 39 cannot be broken by a block interleaver. In order to break such sequences, random interleavers are desirable. Block interleavers are effective if the low-weight sequence is confined to a row. If low-weight sequences (which can be regarded as the combination of lower weight sequences) are confined to several consecutive rows, then the $\nu_c$ columns of the interleaver should be sent in a specified order to spread as much as possible the low-weight sequence. As can be observed in the example in Figure 39, the sequence 1001 will still appear at the input of the encoders for any possible column permutation. Only if we permute the rows of the interleaver in addition to its columns is it possible to break the low-weight sequences. Appropriate selection of a and q for rows and columns depends on the particular set of codes used and on the specific low-weight sequences that we would like to break. We have also designed random permutations (interleavers) by generating random integers i, $1 \le i \le N$, without replacement. We define an "S-random" permutation as follows: each randomly selected integer is compared to S previously selected integers. If the current selection is equal to any S previous selections within a distance of ±S, then the current selection is rejected. This process is repeated until all N integers are selected. While the searching time increases with S, we observed that choosing $S < (N/2)^{1/2}$ usually produces a solution in reasonable time. (For S = 1 we have a purely random interleaver.) In the simulations we used S = 11 for N = 256 and S = 31 for N = 4096.
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The advantage of using ihree or more constituent codes is that the corresponding two or more mterlcavers have a 
better chance to break sequences that were not taken care by another interleaver. The disadvantage is that, for an overall 
desired code rate, each code must be punctured more, resulting in weaker constituent codes. It has been shown that randomly 
selected interieavers and interieavers based on the row-column permutation described above. In general, randomly selected 
permutations are good for low SNR operation (e.g., PCS applications requiring P b (e) = 10" 3 ) where the overall weight 
distribution of the code is more important than the minimum distance. 
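The S-random construction described above can be sketched in a few lines of Python. This is a minimal illustration and not necessarily the procedure used for the reported simulations: the function name `s_random_interleaver`, the restart-on-dead-end strategy and the seed handling are assumptions, and the rejection test is taken as |difference| < S so that S = 1 reduces to a purely random permutation, as stated in the text.

```python
import random


def s_random_interleaver(n, s, seed=0, max_restarts=1000):
    """Return a permutation of range(n) in which any two entries selected
    within s positions of each other differ by at least s.
    (Hypothetical helper; the restart strategy is an assumption.)"""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        pool = list(range(n))
        rng.shuffle(pool)              # random draw order, without replacement
        perm = []
        while pool:
            recent = perm[max(0, len(perm) - s):]   # the s previous selections
            for idx, cand in enumerate(pool):
                # Accept only a candidate far enough from every recent pick.
                if all(abs(cand - p) >= s for p in recent):
                    perm.append(pool.pop(idx))
                    break
            else:
                break                  # dead end: no candidate fits, restart
        if not pool:
            return perm
    raise RuntimeError("no S-random permutation found; try a smaller s")


if __name__ == "__main__":
    print(s_random_interleaver(256, 8)[:16])
```

A value of `s` well below $(N/2)^{1/2}$ keeps the number of restarts small, which is consistent with the observation in the text about searching time.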
1.3.12. Performance with two codes 

The performance obtained by turbo decoding the code with two constituent codes $(1, g_b/g_a)$, where $g_a = (37)_{octal}$
and $g_b = (21)_{octal}$, and with random permutations of lengths N = 4096 and N = 16384 is compared in Figure 40 to the capacity
of a binary-input Gaussian channel for rate r = 1/4. (Note that the components of the Z's corresponding to the tail bits are set
to zero for all iterations.) The best performance curve in Figure 40 is approximately 0.7 dB from the Shannon
limit at BER = $10^{-4}$.

1.3.13. Performance with Unequal Rate Encoders.

We now extend the results to encoders with unequal rates with two K = 5 constituent codes $(1, g_b/g_a, g_c/g_a)$ and
$(g_b/g_a)$, where $g_a = (37)_{octal}$, $g_b = (33)_{octal}$ and $g_c = (25)_{octal}$. This structure improves the performance of the overall, rate
1/4, code, as shown in Figure 40. This improvement is due to the fact that we can avoid using the interleaved information data
at the second encoder and that the rate of the first code is lower than that of the second code. For PCS applications, short
interleavers should be used, since the vocoder frame is usually 20 ms. Therefore, 192-bit and 256-bit interleavers are used as an
example, corresponding to 9.6 and 13 Kbps. The performance of codes with short interleavers is shown in Figure 41 for the K = 5 codes
described above for random permutation and row-column permutation with a = 2 for rows and a = 4 for columns.
1.3.14. Performance with Three Codes. 

The performance of a three-code turbo code with random interleavers is shown in Figure 42 for N = 4096. The three
recursive codes shown in Figure 28 were used for K = 3. Three recursive codes with $g_a = (13)_{octal}$ and $g_b = (11)_{octal}$
were used for K = 4. This K = 4 code has better performance than several others. In Figure 42, the performance of the K = 4
code was improved by going to 30 iterations and using an S-random interleaver with S = 31. For shorter blocks (192 and 256), the
results are shown in Figure 41, where it can be observed that approximately 1 dB SNR is required for BER = $10^{-3}$, which implies
a CDMA capacity C = 0.8η. We have noticed that the slope of the BER curve changes around BER = $10^{-5}$ (flattening effect) if
the interleaver is not designed properly to maximize $d_{min}$ or is chosen at random.
1.4. Comparison Between Parallel and Serially Multiple Concatenated Codes

In this section, we will use the bit-error probability bounds previously derived to compare the performance of parallel 
and serially multiple concatenated block and convolutional codes. 
1.4.1. Parallel and Serially Multiple Concatenated Block Codes

To obtain a fair comparison, we have chosen the following PMCBC and SMCBC: The PMCBC has parameters (11m,
3m, N) and employs two equal (7,3) systematic cyclic codes with generator $g(D) = (1 + D)(1 + D + D^3)$; the SMCBC, instead,
is a (15m, 4m, N) SMCBC obtained by the concatenation of the (7,4) Hamming code with a (15,7) BCH code.

They have almost the same rates ($R_c^s = 0.266$ and $R_c^p = 0.273$) and have been compared choosing the interleaver
length in such a way that the decoding delay due to the interleaver, measured in terms of input information bits, is the same. As
an example, to obtain a delay equal to 12 input bits, we must choose an interleaver length N = 12 for the PMCBC and
$N = 12/R_c^o = 21$ for the SMCBC, where $R_c^o = 4/7$ is the rate of the outer code. The results are shown in Figure 25, where we
plot the bit-error probability versus the signal-to-noise ratio $E_b/N_0$ for various input delays. The results show that, for low
values of the delay, the performances are almost the same. On the other hand, increasing the delay (and thus the interleaver
length N) yields a significant interleaver gain for the SMCBC and almost no gain for the PMCBC. The difference in performance
is 3 dB at $P_b(e) = 10^{-6}$ in favor of the
SMCBC.
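The rates and the delay-matched interleaver lengths quoted above follow from simple arithmetic, sketched here with exact fractions. The helper names are hypothetical, and the parallel-rate formula assumes (as for the PMCBC described above) systematic constituent codes whose information bits are transmitted only once.

```python
from fractions import Fraction


def parallel_rate(constituent_rates):
    """Rate of a parallel concatenation of systematic codes: the information
    bits are sent once, followed by the parity bits of every constituent code."""
    parity_per_info_bit = sum(1 / r - 1 for r in constituent_rates)
    return 1 / (1 + parity_per_info_bit)


def serial_rate(outer_rate, inner_rate):
    """Rate of a serial concatenation: the outer codeword feeds the inner encoder."""
    return outer_rate * inner_rate


# PMCBC: two equal (7,3) codes; SMCBC: (7,4) outer code and (15,7) inner code.
r_parallel = parallel_rate([Fraction(3, 7), Fraction(3, 7)])
r_serial = serial_rate(Fraction(4, 7), Fraction(7, 15))

# Interleaver lengths giving a delay of 12 input information bits.
delay = 12
n_parallel = delay                    # interleaver acts on the information bits
n_serial = delay / Fraction(4, 7)     # interleaver acts on outer codewords

print(r_parallel, r_serial)           # 3/11 4/15
print(n_parallel, n_serial)           # 12 21
```

The same two helpers reproduce the rate 1/3 obtained from two rate 1/2 systematic codes in parallel, and from a rate 1/2 outer code followed by a rate 2/3 inner code in series.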

1.4.2. Parallel and Serially Multiple Concatenated Convolutional Codes

To obtain a fair comparison, we have chosen the following PMCCC and SMCCC: The PMCCC is a rate 1/3 code
obtained by concatenating two equal rate 1/2, four-state systematic recursive convolutional codes with a generator matrix as in the
first row of Table 2. The SMCCC is a rate 1/3 code. It is formed using as an outer code the same rate 1/2, four-state code as in the
PMCCC and, as an inner code, a rate 2/3, four-state systematic recursive convolutional code with a generator matrix as in the
third row of Table 2. Also, in this case, the interleaver lengths have been chosen so as to yield the same decoding delay, due to
the interleaver, in terms of input bits. The results are shown in Figure 44, where we plot the bit-error probability versus the
signal-to-noise ratio $E_b/N_0$ for various input delays.

The results show the great difference in the interleaver gain. In particular, the PMCCC shows an interleaver gain
going as $N^{-1}$, whereas the interleaver gain of the SMCCC, as from Expression (31), goes as $N^{-(d_f^o + 1)/2} = N^{-3}$, since the
free distance of the outer code is equal to 5, which is odd. This means, for $P_b(e) = 10^{-11}$, a gain of more than 2 dB in favor of
the SMCCC.
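As a worked instance of the exponents quoted above (a sketch that only restates the odd-free-distance case; the doubling example is added for illustration):

$$
d_f^o = 5 \;\Rightarrow\; -\frac{d_f^o + 1}{2} = -3, \qquad \frac{N^{-3}}{N^{-1}} = N^{-2}.
$$

Under these scalings, doubling the interleaver length N multiplies the factor $N^{-1}$ of the PMCCC by 1/2 and the factor $N^{-3}$ of the SMCCC by 1/8.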

Previous comparisons have shown that serial concatenation is advantageous with respect to parallel concatenation in 
terms of maximum-likelihood performance. For long interleaver lengths, this significant result remains a theoretical one, as 
maximum-likelihood decoding is an almost impossible achievement. 
2. A SISO MAP Module to Decode Parallel and Serial Multiple Concatenated Codes

Multiple concatenated coding schemes with interleavers comprise a combination of two simple constituent encoders
and an interleaver. The parallel concatenation has been shown to yield remarkable coding gains close to theoretical limits, yet 
admitting a relatively simple iterative decoding technique. The serial concatenation of interleaved codes may offer a superior 
performance. In both coding schemes, the core of the iterative decoding structure is a soft-input soft-output (SISO) module. 
Here, we describe the SISO module in a form that continuously updates the maximum a posteriori (MAP) probabilities of input 
and output code symbols and show how to embed it into iterative decoders for parallel and serially concatenated codes. 

2.1. Introduction

Both concatenated coding schemes admit a suboptimum decoding process based on the iterations of the MAP
algorithm applied to each constituent code. Here we describe a SISO module that implements the MAP algorithm in its basic
form, the extension of it to additive MAP (log-MAP), which is indeed a dual-generalized Viterbi algorithm with correction, and
finally extension to the continuous decoding of PMCCC and SMCCC. As examples of applications, we will show the results
obtained by decoding two low-rate codes, with very high coding gain.

2.2. Iterative Decoding of Parallel and Serial Concatenated Codes 

In this section, we show the block diagram of parallel and serially concatenated codes, together with their iterative
decoders. Both iterative decoding algorithms need a particular module, named SISO, which implements operations strictly
related to the MAP algorithm.
2.2.1. Parallel Concatenated Codes

The block diagram of a PMCCC is shown in Figure 45 (a) (the same construction also applies to block codes). In
Figure 45, a rate 1/3 PMCCC is obtained using two rate 1/2 constituent codes (CCs) and an interleaver. For each input
information bit, the codeword sent to the channel is formed by the input bit, followed by the parity check bits generated by the
two encoders. In Figure 45 (b), the block diagram of the iterative decoder is also shown. It is based on two modules denoted by
"SISO," one for each encoder, an interleaver, and a deinterleaver performing the inverse permutation with respect to the
interleaver.

The SISO module is a four-port device (quadriport), with two inputs and two outputs. Here, it suffices to say that it
accepts as inputs the probability distributions of the information and code symbols labeling the edges of the code trellis, and
forms as outputs an update of these distributions based upon the code constraints. It can be seen in Figure 45 (b) that the updated
probabilities of the code symbols are never used by the decoding algorithm.

2.2.2. Serially Multiple Concatenated Codes 

The block diagram of an SMCCC is shown in Figure 46 (a) (the same construction also applies to block codes). In
Figure 46 (a), a rate 1/3 SMCCC is obtained using as an outer encoder a rate 1/2 encoder, and as an inner encoder a rate 2/3
encoder. An interleaver permutes the output codewords of the outer code before passing them to the inner code. In Figure 46
(b), the block diagram of the iterative decoder is shown. It is based on two modules denoted by "SISO," one for each encoder,
an interleaver, and a deinterleaver. The SISO module is the same as described before. In this case, though, both updated
probabilities of the input and code symbols are used in the decoding procedure.

2.2.3. Soft-Output algorithms 

The SISO module is based on MAP algorithms. These algorithms perform both forward and backward recursions and, 
thus, require that the whole sequence be received before starting the decoding operations. As a consequence, they can only be 
used in block-mode decoding. The memory requirement and computational complexity grow linearly with the sequence length. 

Some algorithms require only a forward recursion, so that they can be used in continuous-mode decoding. However, their
memory and computational complexity grow exponentially with the decoding delay. It is possible to use a MAP
symbol-by-symbol decoding algorithm combining the positive aspects of the other algorithms, i.e., a fixed delay and linear
memory and complexity growth with decoding delay. All these algorithms are truly MAP algorithms. To reduce the
computational complexity, various forms of suboptimum soft-output algorithms can be used. Two approaches have been taken.
The first approach tries to modify the Viterbi algorithm so that it delivers augmented outputs. These augmented outputs include
the depth at which all paths are merged, the difference in length between the best and the next-best paths at the point of
merging, and a given number of the most likely path sequences. The same concept of augmented output was later generalized
for various applications. A different approach to the modification of the Viterbi algorithm consists of generating a reliability
value for each bit of the hard-output signal and is called the soft-output Viterbi algorithm (SOVA). In the binary case, the
degradation of SOVA with respect to MAP is small; however, SOVA is not as effective in the nonbinary case. The second
approach consists of revisiting the original symbol MAP decoding algorithms with the aim of simplifying them to a form
suitable for implementation.
2.3. The SISO Module
2.3.1. The Encoder 

The decoding algorithm underlying the behavior of SISO works for codes admitting a trellis representation. It can be
a time-invariant or time-varying trellis, and, thus, the algorithm can be used for both block and convolutional codes.

In Figure 47, we show a trellis encoder, characterized by the following quantities. (In the following, capital letters U,
C, S, E will denote random variables and lower-case letters u, c, s, e their realizations. The letter $\mathrm{P}[A]$ will denote the
probability of the event A, whereas the letter $P(a)$ will denote a function of a. The subscript k will denote a discrete time,
defined on the time index set K. Other subscripts, like i, will refer to elements of a finite set. Also, "( )" will denote a time
sequence, whereas "{ }" will denote a finite set of elements.)

1. $\mathbf{U} = (U_k)_{k \in K}$ is the sequence of input symbols, defined over a time index set K (finite or infinite) and drawn from the
alphabet $\mathcal{U} = \{u_1, \ldots, u_{N_I}\}$. To the sequence of input symbols, we associate the sequence of a priori probability
distributions $\mathbf{P}(u; I) = (P_k(u_k; I))_{k \in K}$, where $P_k(u_k; I) = \mathrm{P}[U_k = u_k]$.

2. $\mathbf{C} = (C_k)_{k \in K}$ is the sequence of output, or code, symbols, defined over the same time index set K, and drawn from the
alphabet $\mathcal{C} = \{c_1, \ldots, c_{N_O}\}$. To the sequence of output symbols, we associate the sequence of a priori probability
distributions $\mathbf{P}(c; I) = (P_k(c_k; I))_{k \in K}$.

For simplicity of notation, we drop the dependency of $u_k$ and $c_k$ on k. Thus, $P_k(u_k; I)$ and $P_k(c_k; I)$ will be denoted
simply by $P_k(u; I)$ and $P_k(c; I)$, respectively.
2.3.2. The Trellis Section 

The dynamics of a time-invariant convolutional code are completely specified by a single trellis section, which
describes the transitions (edges) between the states of the trellis at time instants k and k + 1. A trellis section is
characterized by the following:

(1) A set of N states $\mathcal{S} = \{s_1, \ldots, s_N\}$. The state of the trellis at time k is $S_k = s$, with $s \in \mathcal{S}$.

(2) A set of $N \times N_I$ edges obtained by the Cartesian product $\mathcal{E} = \mathcal{S} \times \mathcal{U} = \{e_1, \ldots, e_{N \times N_I}\}$, which represents all possible
transitions between the trellis states.

The following functions are associated with each edge $e \in \mathcal{E}$ (see Figure 48):

(1) The starting state $s^S(e)$ (the projection of e onto $\mathcal{S}$).

(2) The ending state $s^E(e)$.

(3) The input symbol u(e) (the projection of e onto $\mathcal{U}$).

(4) The output symbol c(e).

The relationship between these functions depends on the particular encoder. As an example, in the case of systematic
encoders, $(s^E(e), c(e))$ also identifies the edge since u(e) is uniquely determined by c(e). In the following, we only assume that
the pair $(s^S(e), u(e))$ uniquely identifies the ending state $s^E(e)$; this assumption is always verified, as it is equivalent to saying that,
given the initial trellis state, there is a one-to-one correspondence between input sequences and state sequences, a property
required for the code to be uniquely decodable.
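A trellis section as defined above can be held as a plain list of edges, each carrying its four labels. The sketch below builds the section of a four-state recursive systematic encoder; the particular polynomials (feedback $1 + D + D^2$, feedforward $1 + D^2$), the names `Edge` and `rsc_trellis_section`, and the bit ordering of the state are illustrative assumptions, not the codes of Table 2.

```python
from collections import namedtuple

# One trellis edge e with its four labels: s^S(e), s^E(e), u(e), c(e).
Edge = namedtuple("Edge", "start end u c")


def rsc_trellis_section():
    """Edges of one section of a rate 1/2, four-state recursive systematic
    encoder (illustrative polynomials: feedback 1+D+D^2, feedforward 1+D^2)."""
    edges = []
    for state in range(4):                 # state = (d1, d2), d1 most recent
        d1, d2 = state >> 1, state & 1
        for u in (0, 1):
            a = u ^ d1 ^ d2                # feedback bit entering the register
            parity = a ^ d2                # feedforward taps 1 + D^2
            end = (a << 1) | d1            # shift the register by one position
            edges.append(Edge(state, end, u, (u, parity)))
    return edges


if __name__ == "__main__":
    for edge in rsc_trellis_section():
        print(edge)
```

The list has $N \times N_I = 4 \times 2 = 8$ edges, and because the code is systematic the pair (ending state, output symbol) is distinct for every edge, as noted in the text.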
2.3.3. The SISO Algorithm 

The SISO module is a four-port device that accepts at the input the sequences of probability distributions $\mathbf{P}(c; I)$ and
$\mathbf{P}(u; I)$ and outputs the sequences of probability distributions $\mathbf{P}(c; O)$ and $\mathbf{P}(u; O)$ based on its inputs and on its knowledge of the
trellis section (or code in general). We assume first that the time index set K is finite, i.e., $K = \{1, \ldots, n\}$. The algorithm by
which the SISO operates in evaluating the output distributions will be explained in two steps. In the first step, we consider the
following algorithm:

(1) At time k, the output probability distributions are computed as

$$\tilde{P}_k(c; O) = \tilde{H}_c \sum_{e:\, c(e) = c} A_{k-1}[s^S(e)]\, P_k[u(e); I]\, P_k[c(e); I]\, B_k[s^E(e)] \tag{78}$$

$$\tilde{P}_k(u; O) = \tilde{H}_u \sum_{e:\, u(e) = u} A_{k-1}[s^S(e)]\, P_k[u(e); I]\, P_k[c(e); I]\, B_k[s^E(e)] \tag{79}$$

(2) The quantities $A_k(\cdot)$ and $B_k(\cdot)$ are obtained through the forward and backward recursions, respectively, as

$$A_k(s) = \sum_{e:\, s^E(e) = s} A_{k-1}[s^S(e)]\, P_k[u(e); I]\, P_k[c(e); I], \quad k = 1, \ldots, n \tag{80}$$

$$B_k(s) = \sum_{e:\, s^S(e) = s} B_{k+1}[s^E(e)]\, P_{k+1}[u(e); I]\, P_{k+1}[c(e); I], \quad k = n-1, \ldots, 0 \tag{81}$$

with initial values

$$A_0(s) = \begin{cases} 1 & s = S_0 \\ 0 & \text{otherwise} \end{cases} \tag{82}$$

$$B_n(s) = \begin{cases} 1 & s = S_n \\ 0 & \text{otherwise} \end{cases} \tag{83}$$

The quantities $\tilde{H}_c$, $\tilde{H}_u$ are normalization constants defined as follows:

$$\tilde{H}_c \rightarrow \sum_c \tilde{P}_k(c; O) = 1$$

$$\tilde{H}_u \rightarrow \sum_u \tilde{P}_k(u; O) = 1$$

In the second step, from Equations (78) and (79), it is apparent that the quantities $P_k[c(e); I]$ in the first equation and
$P_k[u(e); I]$ in the second do not depend on e, by definition of the summation indices, and thus can be extracted from the
summations. Thus, defining the new quantities

$$P_k(c; O) = H_c \frac{\tilde{P}_k(c; O)}{P_k(c; I)}$$

$$P_k(u; O) = H_u \frac{\tilde{P}_k(u; O)}{P_k(u; I)}$$

where $H_c$ and $H_u$ are normalization constants such that

$$H_c \rightarrow \sum_c P_k(c; O) = 1$$

$$H_u \rightarrow \sum_u P_k(u; O) = 1$$

it can be easily verified that they can be obtained through the expressions

$$P_k(c; O) = H_c \tilde{H}_c \sum_{e:\, c(e) = c} A_{k-1}[s^S(e)]\, P_k[u(e); I]\, B_k[s^E(e)] \tag{84}$$

$$P_k(u; O) = H_u \tilde{H}_u \sum_{e:\, u(e) = u} A_{k-1}[s^S(e)]\, P_k[c(e); I]\, B_k[s^E(e)] \tag{85}$$

where the A's and B's satisfy the same recursions previously introduced in Equations (80) and (81).

The new probability distributions $P_k(u; O)$ and $P_k(c; O)$ represent a smoothed version of the input distributions
$P_k(c; I)$ and $P_k(u; I)$, based on the code constraints and obtained using the probability distributions of all symbols of the sequence
but the kth ones, $P_k(c; I)$ and $P_k(u; I)$. In the literature of PMCCC decoding, $P_k(u; O)$ and $P_k(c; O)$ would be called extrinsic
information. They represent the added value of the SISO module to the a priori distributions $P_k(u; I)$ and $P_k(c; I)$. Basing the
SISO algorithm on $P_k(\cdot; O)$ instead of on $\tilde{P}_k(\cdot; O)$ simplifies the block diagrams, and related software and hardware, of the
iterative schemes for decoding concatenated codes. The SISO module is then represented as in Figure 49.

Previously proposed algorithms were not in a form suitable for working with a general trellis code. Most of them
assumed binary input symbols, some also assumed systematic codes, and none (not even the original BCJR algorithm) could
cope with a trellis having parallel edges. As can be noticed from all summations involved in the equations that define the
SISO algorithm, we work on trellis edges rather than on pairs of states. This makes the algorithm completely general and
capable of coping with parallel edges and also with encoders with rates greater than one, like those encountered in some
concatenated schemes.
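The edge-based formulation lends itself to a compact implementation. The following Python sketch follows the forward recursion (80), the backward recursion (81) and the extrinsic outputs (84) and (85) for a terminated block. The function names, the list-based data layout, the per-step rescaling of the A's and B's (added only for numerical stability; it does not change the normalized outputs) and the two-state accumulator trellis used as a toy example are all assumptions made for illustration.

```python
def normalize(values):
    """Scale a list of non-negative numbers so that it sums to one."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else list(values)


def siso(edges, n_states, p_u, p_c, start_state=0, end_state=0):
    """Multiplicative SISO on a terminated block.

    edges: list of (s_start, s_end, u, c), with u and c integer symbol indices.
    p_u[k-1][u], p_c[k-1][c]: input distributions P_k(u;I), P_k(c;I), k = 1..n.
    Returns (ext_u, ext_c): the output distributions P_k(u;O) and P_k(c;O).
    """
    n = len(p_u)
    # Forward recursion with A_0 concentrated on the known initial state.
    A = [[0.0] * n_states for _ in range(n + 1)]
    A[0][start_state] = 1.0
    for k in range(1, n + 1):
        for s_start, s_end, u, c in edges:
            A[k][s_end] += A[k - 1][s_start] * p_u[k - 1][u] * p_c[k - 1][c]
        A[k] = normalize(A[k])          # rescaling only
    # Backward recursion with B_n concentrated on the known final state.
    B = [[0.0] * n_states for _ in range(n + 1)]
    B[n][end_state] = 1.0
    for k in range(n - 1, -1, -1):
        for s_start, s_end, u, c in edges:
            B[k][s_start] += B[k + 1][s_end] * p_u[k][u] * p_c[k][c]
        B[k] = normalize(B[k])
    # Extrinsic outputs: the k-th input distribution of the same kind is left out.
    ext_u, ext_c = [], []
    for k in range(1, n + 1):
        out_u = [0.0] * len(p_u[k - 1])
        out_c = [0.0] * len(p_c[k - 1])
        for s_start, s_end, u, c in edges:
            w = A[k - 1][s_start] * B[k][s_end]
            out_u[u] += w * p_c[k - 1][c]
            out_c[c] += w * p_u[k - 1][u]
        ext_u.append(normalize(out_u))
        ext_c.append(normalize(out_c))
    return ext_u, ext_c


# Toy trellis (an assumption): 2-state accumulator, next state = state XOR u,
# output symbol c = next state.
ACC_EDGES = [(s, s ^ u, u, s ^ u) for s in (0, 1) for u in (0, 1)]

if __name__ == "__main__":
    p_u = [[0.5, 0.5]] * 3                          # no a priori knowledge on u
    p_c = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]      # code symbols known exactly
    print(siso(ACC_EDGES, 2, p_u, p_c)[0])
```

Because the loops run over edges rather than over pairs of states, parallel edges need no special handling: they simply appear as several entries with the same starting and ending states.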

2.3.4. Computation of Input and Output Bit Extrinsic Information 

In this subsection, bit extrinsic information is derived from the symbol extrinsic information using Equations (84) and
(85). Consider a rate $k_0/n_0$ trellis encoder such that each input symbol U consists of $k_0$ bits and each output symbol C
consists of $n_0$ bits. Assume

$$P_k(c; I) = \prod_{j=1}^{n_0} P_{k,j}(c^j; I) \tag{86}$$

$$P_k(u; I) = \prod_{j=1}^{k_0} P_{k,j}(u^j; I) \tag{87}$$

where $c^j \in \{0,1\}$ denotes the value of the jth bit $C_k^j$ of the output symbol $C_k = c$, $j = 1, \ldots, n_0$, and $u^j \in \{0,1\}$ denotes the
value of the jth bit $U_k^j$ of the input symbol $U_k = u$, $j = 1, \ldots, k_0$.

This assumption is valid in iterative decoding when bit interleavers rather than symbol interleavers are used. One should be
cautious when using $P_k(c; I)$ as a product for those encoders in a concatenated system where the output C in Figure 47 is
connected to a channel. For such cases, if, for example, a nonbinary-input additive white Gaussian noise (AWGN) channel is
used, this assumption usually is not needed (this will be discussed shortly), and $P_k(c; I) = P_k(c|y) = P_k(y|x(c))\,P(c)/P(y)$,
where y is the complex received sample(s) and x(c) is the transmitted nonbinary symbol(s). Then, for binary-input memoryless
channels, $P_k(y|x(c))$ can be written as a product.

WO 00/07323 PCT/US99/17369

After obtaining the symbol probability distributions $P_k(c; O)$ and $P_k(u; O)$ from Equations (84) and (85) by using Equations (80) and
(81), it is easy then to show that the input and output bit extrinsic information can be obtained as

$$P_{k,j}(c^j; O) = H_{c^j} \sum_{c:\, C_k^j = c^j} P_k(c; O) \prod_{i=1,\, i \ne j}^{n_0} P_{k,i}(c^i; I) \tag{88}$$

$$P_{k,j}(u^j; O) = H_{u^j} \sum_{u:\, U_k^j = u^j} P_k(u; O) \prod_{i=1,\, i \ne j}^{k_0} P_{k,i}(u^i; I) \tag{89}$$

where $H_{c^j}$ and $H_{u^j}$ are normalization constants such that

$$H_{c^j} \rightarrow \sum_{c^j \in \{0,1\}} P_{k,j}(c^j; O) = 1$$

$$H_{u^j} \rightarrow \sum_{u^j \in \{0,1\}} P_{k,j}(u^j; O) = 1$$

Equation (86) is not used for those encoders in a concatenated coded system connected to a channel. To keep the
expressions general, as is seen from Equations (80), (81), and (89), $P_k[c(e); I]$ is not represented as a product.

In the following sections, for simplicity of notation, the probability distribution of symbols rather than of bits is 
considered. The extension of the results to probability distributions of bits based on the above derivations is straightforward. 
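The marginalization from symbol extrinsic information to bit extrinsic information can be sketched as follows for the input symbols; the output symbols are handled in the same way with $n_0$ in place of $k_0$. The function name `bit_extrinsic` and the convention that bit j of symbol index u is `(u >> j) & 1` (with j counted from zero) are assumptions made for this illustration.

```python
def bit_extrinsic(symbol_ext, bit_in):
    """Bit extrinsic distributions from a symbol extrinsic distribution.

    symbol_ext[u]: P_k(u;O) for the symbol index u = 0 .. 2**m - 1.
    bit_in[j][b]:  P_{k,j}(b;I), the input distribution of bit j.
    Returns bit_out[j][b], normalized over b for every bit position j.
    """
    m = len(bit_in)
    bit_out = []
    for j in range(m):
        acc = [0.0, 0.0]
        for u, p in enumerate(symbol_ext):
            weight = p
            for i in range(m):
                if i != j:                      # leave out the bit being updated
                    weight *= bit_in[i][(u >> i) & 1]
            acc[(u >> j) & 1] += weight
        total = sum(acc)
        bit_out.append([a / total for a in acc])
    return bit_out


if __name__ == "__main__":
    uniform = [[0.5, 0.5], [0.5, 0.5]]
    print(bit_extrinsic([0.1, 0.2, 0.3, 0.4], uniform))
```

With uniform bit inputs, as in the usage line, the result reduces to the plain marginals of the symbol distribution.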
2.4. The Sliding-Window Soft-Input Soft-Output Module (SW-SISO)

As the previous description should have made clear, the SISO algorithm requires that the whole sequence has been
received before starting the smoothing process. The reason is the backward recursion, which starts from the
(supposed-known) final trellis state. As a consequence, its practical application is limited to the case when the duration of the
transmission is short (n small) or, for n long, when the received sequence can be segmented into independent consecutive
blocks, as for block codes or convolutional codes with trellis termination. It cannot be used for continuous decoding of
convolutional codes. This constraint imposes a frame rigidity on the system and also reduces the overall code rate.

A more flexible decoding strategy is offered by modifying the algorithm in such a way that the
SISO module operates on a fixed memory span and outputs the smoothed probability distributions after a given delay D. This
new algorithm is called the sliding-window soft-input soft-output (SW-SISO) algorithm (and module). We propose two
versions of the SW-SISO that differ in the way they overcome the problem of initializing the backward recursion without
waiting for the entire sequence. From now on, we assume that the time index set K is semi-infinite, i.e., $K = \{1, \ldots, \infty\}$, and that
the initial state $s_0$ is known.

2.4.1. First Version of the Sliding-Window SISO Algorithm (SW-SISO1)
The SW-SISO1 algorithm consists of the following steps:

(1) Initialize $A_0$ according to Equation (82).

(2) Forward recursion at time k: Compute the $A_k$ through the forward recursion of Equation (80).

(3) Initialization of the backward recursion (time k > D):

$$B_k^{(0)}(s) = A_k(s) \quad \forall s \tag{90}$$

(4) Backward recursion: It is performed according to Equation (81) from iterations i = 1 to i = D as

$$B_{k-i}^{(i)}(s) = \sum_{e:\, s^S(e) = s} B_{k-i+1}^{(i-1)}[s^E(e)]\, P_{k-i+1}[u(e); I]\, P_{k-i+1}[c(e); I] \tag{91}$$

and

$$B_{k-D}(s) = B_{k-D}^{(D)}(s) \quad \forall s \tag{92}$$

(5) The probability distributions at time k - D are computed as

$$P_{k-D}(c; O) = H_c \tilde{H}_c \sum_{e:\, c(e) = c} A_{k-D-1}[s^S(e)]\, P_{k-D}[u(e); I]\, B_{k-D}[s^E(e)] \tag{93}$$

$$P_{k-D}(u; O) = H_u \tilde{H}_u \sum_{e:\, u(e) = u} A_{k-D-1}[s^S(e)]\, P_{k-D}[c(e); I]\, B_{k-D}[s^E(e)] \tag{94}$$

2.4.2. The Second Simplified Version of the Sliding-Window SISO Algorithm (SW-SISO2)

A further simplification of the sliding-window SISO algorithm, which is similar to SW-SISO1 except for the
backward initial condition and which significantly reduces the memory requirements, consists of the following steps:

(1) Initialize $A_0$ according to Equation (82).

(2) Forward recursion at time k, k > D: Compute the $A_{k-D}$ through the forward recursion

$$A_{k-D}(s) = \sum_{e:\, s^E(e) = s} A_{k-D-1}[s^S(e)]\, P_{k-D}[u(e); I]\, P_{k-D}[c(e); I], \quad k > D \tag{95}$$

(3) Initialization of the backward recursion (time k > D):

$$B_k^{(0)}(s) = \frac{1}{N} \quad \forall s \tag{96}$$

(4) Backward recursion (time k > D): It is performed according to Equation (91) as before.

(5) The probability distributions at time k - D are computed according to Equations (93) and (94) as before.
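Both sliding-window versions can be sketched with one routine that differs only in how the backward recursion is initialized. In this Python illustration the names, the list-based layout and the per-step rescaling are assumptions; version 1 starts the backward recursion from $A_k$, and version 2 is taken here to start from a uniform distribution over the N states. For readability the sketch keeps every $A_k$ in memory, which a real SW-SISO1 or SW-SISO2 implementation would avoid.

```python
def normalize(values):
    """Scale a list of non-negative numbers so that it sums to one."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else list(values)


def sw_siso(edges, n_states, p_u, p_c, delay, version=1, start_state=0):
    """Sliding-window SISO; returns the extrinsics for times 1 .. n - delay.

    edges: list of (s_start, s_end, u, c); p_u[k-1], p_c[k-1] hold the input
    distributions at time k. Every output uses only `delay` future inputs.
    """
    n = len(p_u)
    A = [[0.0] * n_states]
    A[0][start_state] = 1.0                         # known initial state
    for k in range(1, n + 1):                       # forward recursion
        nxt = [0.0] * n_states
        for s_start, s_end, u, c in edges:
            nxt[s_end] += A[k - 1][s_start] * p_u[k - 1][u] * p_c[k - 1][c]
        A.append(normalize(nxt))
    ext_u, ext_c = [], []
    for k in range(delay + 1, n + 1):
        # Backward initialization: A_k (version 1) or uniform (version 2).
        B = list(A[k]) if version == 1 else [1.0 / n_states] * n_states
        for i in range(1, delay + 1):               # D backward steps
            t = k - i + 1                           # time of the inputs used
            prev = [0.0] * n_states
            for s_start, s_end, u, c in edges:
                prev[s_start] += B[s_end] * p_u[t - 1][u] * p_c[t - 1][c]
            B = normalize(prev)
        t = k - delay                               # time of the output
        out_u = [0.0] * len(p_u[t - 1])
        out_c = [0.0] * len(p_c[t - 1])
        for s_start, s_end, u, c in edges:
            w = A[t - 1][s_start] * B[s_end]
            out_u[u] += w * p_c[t - 1][c]
            out_c[c] += w * p_u[t - 1][u]
        ext_u.append(normalize(out_u))
        ext_c.append(normalize(out_c))
    return ext_u, ext_c


# Toy trellis (an assumption): 2-state accumulator, c = next state = state XOR u.
ACC_EDGES = [(s, s ^ u, u, s ^ u) for s in (0, 1) for u in (0, 1)]
```

Since every output at time k - D requires D backward steps, the work per output grows linearly with D, in line with the complexity figures given in the next subsection.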

2.4.3. Memory and Computational Complexity 
2.4.3.1. Algorithm SW-SISO1.

For a convolutional code with parameters $(k_0, n_0)$ and number of states N, so that $N_I = 2^{k_0}$ and $N_O = 2^{n_0}$,
the algorithm SW-SISO1 requires storage of $N \times D$ values of A's and $D(N_I + N_O)$ values of the input unconstrained
probabilities $P_k(u; I)$ and $P_k(c; I)$. Moreover, to update the A's and B's for each time instant, it needs to perform $2 \times N \times N_I$
multiplications and N additions of $N_I$ numbers. To output the set of probability distributions at each time instant, we need a
D-times long backward recursion. Thus, overall the computational complexity requires the following: $2(D+1) \times N \times N_I$
multiplications and $(D+1) \times N \times (N_I - 1)$ additions.
2.4.3.2. Algorithm SW-SISO2.

This simplified version of the sliding-window SISO algorithm does not require the storage of the $N \times D$ values of A's,
as they are updated with a delay of D steps. As a consequence, only N values of A's and $D(N_I + N_O)$ values of the input
unconstrained probabilities $P_k(u; I)$ and $P_k(c; I)$ need to be stored. The computational complexity is the same as that for the
previous version of the algorithm. However, since the initialization of the B recursion is less accurate, a larger value of D may 
be necessary. 
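The storage and operation counts stated above can be tabulated for a given code with a small helper. The function name and the dictionary keys are hypothetical; the formulas are the ones given in the text.

```python
def sw_siso_costs(k0, n0, n_states, delay, version=1):
    """Storage and per-time-instant operation counts of SW-SISO1 / SW-SISO2."""
    n_in, n_out = 2 ** k0, 2 ** n0              # N_I and N_O
    return {
        # SW-SISO1 stores N x D values of A; SW-SISO2 stores only N.
        "stored_A_values": n_states * delay if version == 1 else n_states,
        "stored_input_probabilities": delay * (n_in + n_out),
        "multiplications": 2 * (delay + 1) * n_states * n_in,
        "additions": (delay + 1) * n_states * (n_in - 1),
    }


if __name__ == "__main__":
    # Example values (assumed): rate 1/2, 16 states, delay D = 20.
    print(sw_siso_costs(1, 2, 16, 20, version=1))
    print(sw_siso_costs(1, 2, 16, 20, version=2))
```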

2.5. The Additive SISO Algorithm (A-SISO) 

The sliding-window SISO algorithms solve the problems of continuously updating the probability distributions,
without requiring trellis terminations. Their computational complexity, however, is still high when compared to other
suboptimal algorithms like SOVA. This is due mainly to the fact that they are multiplicative algorithms. We overcome this
drawback by proposing the additive version of the SISO algorithm. Clearly, the same procedure can be applied to its two
sliding-window versions, SW-SISO1 and SW-SISO2.

To convert the previous SISO algorithm from multiplicative to additive form, we exploit the monotonicity of the
logarithm function, and use for the quantities $P(u; \cdot)$, $P(c; \cdot)$, A, and B their natural logarithms, according to the following
definitions:

$\pi_k(c; I) = \log [P_k(c; I)]$
$\pi_k(u; I) = \log [P_k(u; I)]$
$\pi_k(c; O) = \log [P_k(c; O)]$
$\pi_k(u; O) = \log [P_k(u; O)]$
$\alpha_k(s) = \log [A_k(s)]$
$\beta_k(s) = \log [B_k(s)]$

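With these definitions, the SISO algorithm defined by Equations (84) and (85) and Equations (80) and (81) becomes
the following: At time k, the output probability distributions are computed as

$$\pi_k(c; O) = \log \left[ \sum_{e:\, c(e) = c} \exp\{\alpha_{k-1}[s^S(e)] + \pi_k[u(e); I] + \beta_k[s^E(e)]\} \right] + h_c \tag{97}$$

$$\pi_k(u; O) = \log \left[ \sum_{e:\, u(e) = u} \exp\{\alpha_{k-1}[s^S(e)] + \pi_k[c(e); I] + \beta_k[s^E(e)]\} \right] + h_u \tag{98}$$

where the quantities $\alpha_k(\cdot)$ and $\beta_k(\cdot)$ are obtained through the forward and backward recursions, respectively, as

$$\alpha_k(s) = \log \left[ \sum_{e:\, s^E(e) = s} \exp\{\alpha_{k-1}[s^S(e)] + \pi_k[u(e); I] + \pi_k[c(e); I]\} \right], \quad k = 1, \ldots, n \tag{99}$$

$$\beta_k(s) = \log \left[ \sum_{e:\, s^S(e) = s} \exp\{\beta_{k+1}[s^E(e)] + \pi_{k+1}[u(e); I] + \pi_{k+1}[c(e); I]\} \right], \quad k = n-1, \ldots, 0 \tag{100}$$

with initial values

$$\alpha_0(s) = \begin{cases} 0 & s = S_0 \\ -\infty & \text{otherwise} \end{cases}$$

$$\beta_n(s) = \begin{cases} 0 & s = S_n \\ -\infty & \text{otherwise} \end{cases}$$

The quantities $h_c$ and $h_u$ are normalization constants needed to prevent excessive growth of the numerical values of
the α's and β's.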

The problem in the previous recursions consists in the evaluation of the logarithm of a sum of exponentials like (in
general):

$$a = \log \left[ \sum_{i=1}^{L} \exp\{a_i\} \right] \tag{101}$$

To evaluate a in Equation (101), we can use two approximations, with increasing accuracy (and complexity).
The first approximation is

$$a = \log \left[ \sum_{i=1}^{L} \exp\{a_i\} \right] \simeq a_M \tag{102}$$

where we have defined

$$a_M = \max_i \{a_i\}, \quad i = 1, \ldots, L$$

This approximation assumes that

$$a_M \gg a_i, \quad \forall a_i \ne a_M$$

It is almost optimal for medium-high signal-to-noise ratios and leads to performance degradations of the order of 0.5
to 0.7 dB for very low signal-to-noise ratios.

Using Equation (102), the recursions of Equations (99) and (100) become

$$\alpha_k(s) = \max_{e:\, s^E(e) = s} \{\alpha_{k-1}[s^S(e)] + \pi_k[u(e); I] + \pi_k[c(e); I]\}, \quad k = 1, \ldots, n \tag{103}$$

$$\beta_k(s) = \max_{e:\, s^S(e) = s} \{\beta_{k+1}[s^E(e)] + \pi_{k+1}[u(e); I] + \pi_{k+1}[c(e); I]\}, \quad k = n-1, \ldots, 0 \tag{104}$$

and the π's of Equations (97) and (98) become

$$\pi_k(c; O) = \max_{e:\, c(e) = c} \{\alpha_{k-1}[s^S(e)] + \pi_k[u(e); I] + \beta_k[s^E(e)]\} + h_c \tag{105}$$

$$\pi_k(u; O) = \max_{e:\, u(e) = u} \{\alpha_{k-1}[s^S(e)] + \pi_k[c(e); I] + \beta_k[s^E(e)]\} + h_u \tag{106}$$

When the accuracy of the previously proposed approximation is not sufficient, we can evaluate a in Equation (101)
using the following recursive algorithm:

$$a^{(1)} = a_1$$

$$a^{(l)} = \max\{a^{(l-1)}, a_l\} + \log \left[ 1 + \exp\left(-|a^{(l-1)} - a_l|\right) \right], \quad l = 2, \ldots, L$$

$$a \equiv a^{(L)}$$

To evaluate a, the algorithm needs to perform (L - 1) times two kinds of operations: a comparison between two
numbers to find the maximum, and the computation of

$$\log \left[ 1 + e^{-\Delta} \right], \quad \Delta \ge 0$$

The second operation can be implemented using a single-entry look-up table up to the desired accuracy. Therefore, a
in Equation (101) can be written as $a = a_M + \delta(a_1, a_2, \ldots, a_L) = \max^*_i \{a_i\}$. The second term, $\delta(a_1, a_2, \ldots, a_L)$, is called the
correction term and can be computed using a look-up table, as discussed above. Now, if desired, max can be replaced by max*
in Equations (103) through (106).
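The two ways of evaluating the logarithm of a sum of exponentials can be compared directly. The sketch below gives the max approximation, the exact pairwise recursion, and a table-driven variant of the correction term; the function names, the table step of 0.125 and the table size of 64 entries are illustrative assumptions.

```python
import math

NEG_INF = float("-inf")


def max_log(values):
    """First approximation: keep only the largest term."""
    return max(values)


def max_star_pair(a, b):
    """Exact log(exp(a) + exp(b)) as max plus a correction term."""
    m = max(a, b)
    if m == NEG_INF:                    # both terms are log(0)
        return m
    return m + math.log1p(math.exp(-abs(a - b)))


def max_star(values):
    """Recursive evaluation over L terms using L - 1 pairwise steps."""
    acc = values[0]
    for v in values[1:]:
        acc = max_star_pair(acc, v)
    return acc


def make_correction_table(step=0.125, size=64):
    """Samples of log(1 + exp(-delta)) at delta = 0, step, 2*step, ..."""
    return [math.log1p(math.exp(-i * step)) for i in range(size)]


def max_star_pair_lut(a, b, table, step=0.125):
    """Pairwise max* with the correction read from a look-up table."""
    m = max(a, b)
    if m == NEG_INF:
        return m
    delta = abs(a - b)
    if delta >= len(table) * step:      # correction is negligible out here
        return m
    return m + table[int(delta / step)]


if __name__ == "__main__":
    terms = [0.1, -1.3, 2.0]
    print(max_log(terms), max_star(terms))
```

The table resolution sets the accuracy of the correction term; a finer step or interpolation trades memory for accuracy.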

Clearly, the additive form of the SISO algorithm can be applied to both versions of the sliding-window SISO 
algorithms described in the previous section, with straightforward modifications. 
2.6. Applications of the ASW-SISO Module
Consider a PMCCC obtained using as constituent codes two equal rate 1/2 systematic, recursive, 16-state
convolutional encoders with generating matrix

$$G(D) = \left[ 1, \; \frac{1 + D + D^3 + D^4}{1 + D^3 + D^4} \right]$$

The interleaver length is N = 16,384. The overall PMCCC forms a very powerful code for possible use in applications
requiring reliable operation at very low signal-to-noise ratios.

The performance of the continuous iterative decoding algorithm, applied to the concatenated code, is obtained by
simulation, using the ASW-SISO and the look-up table algorithms. It is shown in Figure 50, where we plot the bit-error
probability as a function of the number of iterations of the decoding algorithm for various values of the bit signal-to-noise ratio,
$E_b/N_0$. It can be seen that the decoding algorithm converges to an error probability of $10^{-5}$ for signal-to-noise ratios of 0.2 dB
with nine iterations. Moreover, convergence is guaranteed also at signal-to-noise ratios as low as 0.05 dB, which is 0.55 dB
from the Shannon capacity limit.

As a second example, we construct the serial concatenation of two convolutional codes (SMCCCs) using as an outer
code the rate 1/2, 8-state nonrecursive encoder with generating matrix

$$G(D) = \left[ 1 + D + D^3, \; 1 + D \right]$$

and, as an inner code, the rate 1/2, 8-state recursive encoder with generating matrix

$$G(D) = \left[ 1, \; \frac{1 + D + D^3}{1 + D} \right]$$

The resulting SMCCC has rate 1/4. The interleaver length has been chosen to ensure a decoding delay in terms of input
information bits equal to 16,384.

The performance of the concatenated code, obtained by simulation as before, is shown in Figure 51, where we plot 
the bit-error probability as a function of the number of iterations of the decoding algorithm for various values of the bit 
signal-to-noise ratio, Et/N 0 . It can be seen that the decoding algorithm converges up to an error probability of 10" 5 , for 
signal-to-noise ratios of 0.10 dB with nine iterations. Moreover, convergence also is guaranteed at signal-to-noise ratios as low 
as -0. 1 0 dB, which is 0.71 dB from the capacity limit. 

As a third, and final, example, we compare the performance of a PMCCC and an SMCCC with the same rate and 
complexity. The concatenated code rate is 1/3, the CCs are four-state recursive encoders (rates 1/2 + 1/2 for the PMCCCs and 
rates 1/2 + 2/3 for the SMCCCs), and the decoding delays in terms of input bits are equal to 16,384. In Figure 52, we report the 
bit-error probability versus the signal-to-noise ratio for six and nine decoding iterations. As the curves show, the PMCCC 
outperforms the SMCCC for high values of the bit-error probabilities. Below $10^{-5}$ (for nine iterations), the SMCCC behaves 
significantly better and does not present the "floor" behavior typical of PMCCCs. In particular, at 10"*, the SMCCC has an 
advantage of 0.5 dB with nine iterations. 
3. ADSL systems 

Figures 53, 54, 55 and 56 are models for facilitating accurate and concise DMT signal waveform descriptions. In 
Figures 53, 54, 55 and 56, $Z_i$ is DMT sub-carrier $i$ (defined in the frequency domain), and $x_n$ is the $n$th IDFT output sample 
(defined in the time domain). The DAC and analog processing block construct the continuous transmit voltage waveform 
corresponding to the discrete digital input samples. More precise specifications for these analog blocks arise indirectly from 
the analog transmit signal linearity and power spectral density specifications. The use of Figures 53, 54, 55 and 56 as a 
transmitter reference model allows all initialization signal waveforms to be described through the sequence of DMT symbols, 
$\{Z_i\}$, required to produce that signal. Allowable differences in the characteristics of different digital-to-analog and analog 
processing blocks will produce somewhat different continuous-time voltage waveforms for the same initialization signal. 
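
The relationship between the sub-carriers $Z_i$ and the time-domain samples $x_n$ can be illustrated with a small sketch. It assumes the conventional DMT arrangement, which is not spelled out in the text above: N sub-carriers are extended with Hermitian symmetry to 2N points so that the IDFT output is real, the DC value is taken as real, the Nyquist bin is set to zero, and no scaling factor is applied. The function name is hypothetical.

```python
import cmath

def dmt_idft(Z):
    """Map sub-carriers Z[0..N-1] to 2N real samples x[0..2N-1].

    Assumed relation: x_n = sum_{i=0}^{2N-1} Z'_i * exp(j*pi*n*i/N),
    where Z' is the Hermitian-symmetric extension of Z.
    """
    N = len(Z)
    full = [0j] * (2 * N)
    full[0] = complex(Z[0].real, 0.0)       # DC term must be real
    for i in range(1, N):
        full[i] = complex(Z[i])
        full[2 * N - i] = complex(Z[i]).conjugate()  # mirror image
    x = []
    for n in range(2 * N):
        acc = sum(full[i] * cmath.exp(1j * cmath.pi * n * i / N)
                  for i in range(2 * N))
        x.append(acc.real)                  # imaginary part cancels
    return x
```

For example, loading only sub-carrier 1 with the value 1 produces one cycle of a cosine of amplitude 2 across the 2N output samples.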
3.1. ATU-C transmitter reference models 

ATM and STM are application options. ATU-C and ATU-R may be configured for either STM bit sync transport or 
ATM cell transport. 

3.1.1 ATU-C transmitter reference model for STM transport 

Figure 53 is a block diagram of an ADSL Transceiver Unit-Central office (ATU-C) transmitter showing the 
functional blocks and interfaces for the downstream transport of STM data. 

The basic STM transport mode is bit serial. The framing mode used determines if byte boundaries, if present at the 
V-C interface, shall be preserved. Outside the ASx/LSx serial interfaces data bytes are transmitted MSB first. All serial 
processing in the ADSL frame (e.g., CRC, scrambling, etc.) shall, however, be performed LSB first, with the outside world 
MSB considered by the ADSL as LSB. As a result, the first incoming bit (outside world MSB) shall be the first processed bit 
inside the ADSL (ADSL LSB). ADSL equipment shall support at least bearer channels AS0 and LS0 downstream. Support of 
other bearer channels is optional. Two paths are shown between the Mux/Sync control and Tone ordering; the "fast" path 
provides low latency; the interleaved path provides very low error rate and greater latency. 
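
The bit-ordering convention described above (outside-world MSB treated as the ADSL LSB, with LSB-first serial processing) can be sketched as follows. The helper names are hypothetical and the byte-wide view is an assumption made for illustration.

```python
def to_adsl_byte(outside_byte):
    """Bit-reverse an 8-bit value so the outside-world MSB becomes the ADSL LSB."""
    result = 0
    for k in range(8):
        if outside_byte & (1 << (7 - k)):
            result |= 1 << k
    return result

def lsb_first_bits(adsl_byte):
    """Bits of an ADSL byte in the order they are serially processed (LSB first)."""
    return [(adsl_byte >> k) & 1 for k in range(8)]
```

Processing the reversed byte LSB first reproduces the order in which the bits arrived from the outside world, so the first incoming bit is also the first processed bit.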
An ADSL system supporting STM shall be capable of operating in a dual latency mode for the downstream direction, 
in which user data is allocated to both paths (i.e. fast and interleaved). An ADSL system supporting STM shall be capable of 
operating in a single latency mode for both the downstream and upstream directions, in which all user data is allocated to one 
path (i.e. fast or interleaved). An ADSL system supporting STM transport may be capable of operating in an optional dual 
latency mode for the upstream, in which user data is allocated to both paths (i.e. fast and interleaved). 
3.1.2 ATU-C transmitter reference model for ATM transport 

Figure 54 is a block diagram of an ADSL Transceiver Unit-Central office (ATU-C) transmitter showing the 
functional blocks and interfaces that are referenced in ITU-T G.992.1 Recommendation for the downstream transport of ATM 
data. 

Byte boundaries at the V-C interface shall be preserved in the ADSL data frame. Outside the ASx/LSx serial interfaces 
data bytes are transmitted MSB first. All serial processing in the ADSL frame (e.g., CRC, scrambling, etc.) shall, however, be 
performed LSB first, with the outside world MSB considered by the ADSL as LSB. As a result, the first incoming bit (outside world 
MSB) will be the first processed bit inside the ADSL (ADSL LSB), and the CLP bit of the ATM cell header will be carried in the 
MSB of the ADSL frame byte (i.e., processed last). ADSL equipment shall support at least bearer channel AS0 downstream. 
Two paths are shown between the Mux/Sync control and Tone ordering; the "fast" path provides low latency, the interleaved 
path provides very low error rate and greater latency. An ADSL system supporting ATM transport shall be capable of operating 
in a single latency mode, in which all user data is allocated to one path (i.e. fast or interleaved). An ADSL system supporting 
ATM transport may be capable of operating in an optional dual latency mode, in which user data is allocated to both paths (i.e. 
fast and interleaved). 
3.2. ATU-R transmitter reference models 

ATM and STM are application options. ATU-C and ATU-R may be configured for either STM bit sync transport or 
ATM cell transport. 

3.2.1 ATU-R transmitter reference model for STM transport 

Figure 55 shows a block diagram of an ATU-R transmitter with the functional blocks and interfaces that are 
referenced in ITU-T Recommendation G.992.1 for the upstream transport of STM data. 

The basic STM transport mode is bit serial. The framing mode used determines if byte boundaries, if present at the 
V-C interface, shall be preserved. Outside the LSx serial interfaces data bytes are transmitted MSB first. All serial processing 
in the ADSL frame (e.g., CRC, scrambling, etc.) shall, however, be performed LSB first, with the outside world MSB 
considered by the ADSL as LSB. As a result, the first incoming bit (outside world MSB) will be the first processed bit inside 
the ADSL (ADSL LSB). ADSL equipment shall support at least bearer channel LS0 upstream. Two paths are shown between 
the Mux/Sync control and Tone ordering; the "fast" path provides low latency; the interleaved path provides very low error rate 
and greater latency. An ADSL system supporting STM shall be capable of operating in a dual latency mode for the downstream 
direction, in which user data is allocated to both paths (i.e. fast and interleaved). An ADSL system supporting STM shall be 
capable of operating in a single latency mode for both the downstream and upstream directions, in which all user data is 
allocated to one path (i.e. fast or interleaved). An ADSL system supporting STM transport may be capable of operating in an 
optional dual latency mode for the upstream, in which user data is allocated to both paths (i.e. fast and interleaved). 

3.2.2 ATU-R transmitter reference model for ATM transport 

Figure 56 shows a block diagram of an ATU-R transmitter with the functional blocks and interfaces that are 
referenced in ITU-T Recommendation G.992.1 for the upstream transport of ATM data. 

Byte boundaries at the T-R interface shall be preserved in the ADSL data frame. Outside the LSx serial interfaces 
data bytes are transmitted MSB first in accordance with ITU-T Recommendations I.361 and I.432. All serial processing in the 
ADSL frame (e.g. CRC, scrambling, etc.) shall, however, be performed LSB first, with the outside world MSB considered by 
the ADSL as LSB. As a result, the first incoming bit (outside world MSB) will be the first processed bit inside the ADSL 
(ADSL LSB), and the CLP bit of the ATM cell header will be carried in the MSB of the ADSL frame byte (i.e., processed last). 
ADSL equipment shall support at least bearer channel LS0 upstream. Two paths are shown between the Mux/Sync control and 
Tone ordering; the "fast" path provides low latency; the interleaved path provides very low error rate and greater latency. An 
ADSL system supporting ATM transport shall be capable of operating in a single latency mode, in which all user data is 
allocated to one path (i.e. fast or interleaved). An ADSL system supporting ATM transport may be capable of operating in an 
optional dual latency mode, in which user data is allocated to both paths (i.e. fast and interleaved). 
3.3 Transport Capacity. 

An ADSL system may transport up to seven user data streams on seven bearer channels simultaneously, including up to four 
independent downstream simplex bearers (unidirectional from the network operator (i.e. V-C interface) to the CI (i.e. T-R 
interface)). 

An ADSL system may transport up to three duplex bearers (bi-directional between the network operator and the CI). 

The three duplex bearers may alternatively be configured as independent unidirectional simplex bearers, and the 
rates of the bearers in the two directions (network operator toward CI and vice versa) do not need to match. 

All bearer channel data rates shall be programmable in any combination of integer multiples of 32 kbit/s. 
The ADSL data multiplexing format is flexible enough to allow other transport data rates, such as channelizations 
based on existing 1.544 Mbit/s, but the support of these data rates (non-integer multiples of 32 kbit/s) will be limited by the 
ADSL system's available capacity for synchronization. 

The maximum net data rate transport capacity of an ADSL system will depend on the characteristics of the loop on 
which the system is deployed, and on certain configurable options that affect overhead. The ADSL bearer channel rates shall be 
configured during the initialization and training procedure. 

The transport capacity of an ADSL system per se is defined only as that of the bearer channels. When, however, an 
ADSL system is installed on a line that also carries POTS or ISDN signals, the overall capacity is that of POTS or ISDN plus 
ADSL. 

A distinction is made between the transport of synchronous (STM) and asynchronous (ATM) data. An ATU-x shall 
be configured to support STM transmission or ATM transmission. Bearer channels configured to transport STM data can also 
be configured to carry ATM data. ADSL equipment may be capable of simultaneously supporting both ATM and STM 
transport. 

If an ATU-x supports a particular bearer channel it shall support it through both the fast and interleaved paths. 
In addition, an ADSL system may transport a Network Timing Reference (NTR). 
3.3.1. Transport of STM data 

ADSL systems transporting STM shall support the simplex bearer channel AS0 and the duplex bearer channel LS0 
downstream. Bearer channels AS0, LS0, and any other bearer channels supported shall be independently allocable to a 
particular latency path as selected by the ATU-C at start-up. The system shall support dual-latency downstream. 

ADSL systems transporting STM shall support the duplex bearer channel LS0 upstream using a single latency path. 

Bearer channel AS0 shall support the transport of data at all integer multiples of 32 kbit/s from 32 kbit/s to 6144 kbit/s. 

Bearer channel LS0 shall support 16 kbit/s and all integer multiples of 32 kbit/s from 32 kbit/s to 640 kbit/s. 

When AS1, AS2, AS3, LS1 and LS2 are provided, they shall support the range of integer multiples of 32 kbit/s shown 
in Table 4. Support for data rates based on non-integer multiples of 32 kbit/s is also optional. 
Table 4 shows the required 32 kbit/s integer multiples for transport of STM. 
Table 4. Required 32 kbit/s integer multiples for transport of STM 

Bearer channel   Lowest Required     Largest Required    Corresponding Highest
                 Integer Multiple    Integer Multiple    Required Data Rate (kbit/s)
--------------   ----------------    ----------------    ---------------------------
AS0              1                   192                 6144
AS1              1                   144                 4608
AS2              1                   96                  3072
AS3              1                   48                  1536
LS0              1                   20                  640
LS1              1                   20                  640
LS2              1                   20                  640
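
The STM rate requirements of Table 4 lend themselves to a simple check. The sketch below is illustrative only: the function name is hypothetical, the lowest required multiple is assumed to be 1 for every channel, and the 16 kbit/s case for LS0 is taken from the text of this subsection.

```python
# Largest required integer multiple of 32 kbit/s per bearer channel (Table 4).
MAX_MULTIPLE = {"AS0": 192, "AS1": 144, "AS2": 96, "AS3": 48,
                "LS0": 20, "LS1": 20, "LS2": 20}

def is_required_stm_rate(channel, rate_kbits):
    """Return True if the rate is one a bearer channel is required to support."""
    if channel == "LS0" and rate_kbits == 16:
        return True                      # 16 kbit/s "C"-channel case
    if rate_kbits % 32 != 0:
        return False                     # non-integer multiples are optional
    return 1 <= rate_kbits // 32 <= MAX_MULTIPLE[channel]
```

For instance, 4608 kbit/s is a required rate for AS1 (144 x 32), while 4640 kbit/s (145 x 32) is not.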



Table 5 illustrates the data rate terminology and definitions used for STM transport. 

Table 5. Data Rate Terminology for STM transport 

Data Rate                                                            Equation (kbit/s)          Reference Point
-------------------------------------------------------------------  -------------------------  ---------------
STM data rate = "Net data rate"                                      Σ(B_I, B_F) × 32 (NOTE)    ASx + LSx
"Net data rate" + Frame overhead rate = "Aggregate data rate"        Σ(K_I, K_F) × 32           A
"Aggregate data rate" + RS Coding overhead rate = "Total data rate"  Σ(N_I, N_F) × 32           B
"Total data rate" + Trellis Coding overhead rate = Line rate         Σ b_i × 4                  U

NOTE - Net data rate increases by 16 kbit/s if a 16 kbit/s "C"-channel is used. 
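
The rate equations of Table 5 follow from the 4 kHz DMT frame rate: one byte per frame corresponds to 8 x 4000 = 32 kbit/s, and one bit per DMT symbol corresponds to 4 kbit/s. The sketch below assumes that B, K and N are per-frame byte counts given as (interleaved, fast) pairs and that b_i is the number of bits loaded on tone i; the function name and argument layout are hypothetical.

```python
def stm_rates(B, K, N, bits_per_tone):
    """Return the data rates (kbit/s) at reference points ASx+LSx, A, B and U.

    B, K, N: (interleaved, fast) byte counts per data frame (assumed meaning).
    bits_per_tone: number of bits b_i carried by each DMT tone.
    """
    return {
        "net": sum(B) * 32,             # payload bytes only
        "aggregate": sum(K) * 32,       # payload plus frame overhead
        "total": sum(N) * 32,           # plus RS coding overhead
        "line": sum(bits_per_tone) * 4  # plus trellis coding overhead
    }
```

Each rate in the chain should be no smaller than the one before it, since every stage only adds overhead.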



3.3.2. Transport of ATM data 

An ADSL system transporting ATM shall support the single latency mode at all integer multiples of 32 kbit/s up to 
6.144 Mbit/s downstream and up to 640 kbit/s upstream. For single latency, ATM data shall be mapped to bearer channel AS0 
in the downstream direction and to bearer channel LS0 in the upstream direction. 

The need for dual latency for ATM services depends on the service/application profile, and is under study by the ITU. 
One of three different "latency classes" may be used: single latency, not necessarily the same for each direction of 
transmission; dual latency downstream, single latency upstream; or dual latency both upstream and downstream. 

ADSL systems transporting ATM shall support bearer channel AS0 downstream and bearer channel LS0 upstream, 
with each of these bearer channels independently allocable to a particular latency path as selected by the ATU-C at start-up. 
Therefore, support of dual latency is optional for both downstream and upstream. 

If downstream ATM data are transmitted through a single latency path (i.e., 'fast' only or 'interleaved' only), only bearer 
channel AS0 shall be used, and it shall be allocated to the appropriate latency path. If downstream ATM data are transmitted 
through both latency paths (i.e., 'fast' and 'interleaved'), only bearer channels AS0 and AS1 shall be used, and they shall be 
allocated to different latency paths. 

Similarly, if upstream ATM data are transmitted through a single latency path (i.e., 'fast' only or 'interleaved' only), only 
bearer channel LS0 shall be used and it shall be allocated to the appropriate latency path. The choice of the fast or interleaved 
path may be made independently of the choice for the downstream data. If upstream ATM data are transmitted through both 
latency paths (i.e., 'fast' and 'interleaved'), only bearer channels LS0 and LS1 shall be used and they shall be allocated to 
different latency paths. 

Bearer channel AS0 shall support the transport of data at all integer multiples of 32 kbit/s from 32 kbit/s to 6144 kbit/s. 
Bearer channel LS0 shall support all integer multiples of 32 kbit/s from 32 kbit/s to 640 kbit/s. Support for data rates based on 
non-integer multiples of 32 kbit/s is also optional. 

When AS1 and LS1 are provided, they shall support the range of integer multiples of 32 kbit/s shown in Table 4. 
Support for data rates based on non-integer multiples of 32 kbit/s is optional. 

Bearer channels AS2, AS3 and LS2 shall not be provided for an ATM based ATU-x. 
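
The bearer-channel rules for ATM transport stated in this subsection can be summarized in a short sketch. The function name and the string values used for the direction are hypothetical; which of the two channels is mapped to the fast path in dual latency mode is left open here, since the text only requires that they use different latency paths.

```python
def atm_bearer_channels(direction, dual_latency):
    """Return the bearer channels used for ATM data in one direction."""
    if direction == "downstream":
        first, second = "AS0", "AS1"
    elif direction == "upstream":
        first, second = "LS0", "LS1"
    else:
        raise ValueError("direction must be 'downstream' or 'upstream'")
    # Single latency uses one channel; dual latency uses two channels
    # that must be allocated to different latency paths.
    return [first, second] if dual_latency else [first]
```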

Table 6 illustrates the data rate terminology and definitions used for ATM transport. 

Table 6. Data Rate Terminology for ATM transport 

Data Rate                                                            Equation (kbit/s)     Reference Point
-------------------------------------------------------------------  --------------------  ---------------
53 x 8 x ATM cell rate = "Net data rate"                             Σ(B_I, B_F) × 32      ASx + LSx
"Net data rate" + Frame overhead rate = "Aggregate data rate"        Σ(K_I, K_F) × 32      A
"Aggregate data rate" + RS Coding overhead rate = "Total data rate"  Σ(N_I, N_F) × 32      B
"Total data rate" + Trellis Coding overhead rate = Line rate         Σ b_i × 4             U



3.3.3. ADSL system overheads and total bit rates 

The total bit rate transmitted by the ADSL system when operating in an optional reduced-overhead framing mode 
shall include capacity for the data rate transmitted in the ADSL bearer channels and ADSL system overhead (which includes: 
an ADSL embedded operations channel, EOC; an ADSL overhead control channel, AOC; CRC check bytes; fixed indicator bits 
for OAM; FEC redundancy bytes). When operating in the full-overhead mode the total bit rate shall also include capacity for 
the synchronization control bytes and capacity for bearer channel synchronization control. 

The internal overhead channels and their rates are shown in Table 7. 
Table 7. Internal overhead channel functions and rates 

All rates are in kbit/s, given as minimum/maximum. Downstream columns are indexed by the number of ASx bearer 
channels, upstream columns by the number of LSx bearer channels. 

                                              Downstream rate                Upstream rate
                                              ASx > 1        ASx = 1         LSx > 1        LSx = 1
--------------------------------------------  -------------  --------------  -------------  -------------
Synchronization control, CRC and AOC;         32/32          32/32           32/32          32/32
interleaved buffer
Synchronization control, CRC, EOC and         32/32          32/32           32/32          32/32
indicator bits; fast buffer
Total for reduced overhead framing            32/64          32/64           32/64          32/64
                                              (NOTE 2)       (NOTE 2)        (NOTE 2)       (NOTE 2)
Synchronization capacity (shared among        64/128         64/96           32/64          32/32
all bearer channels)                          (NOTE 3)       (NOTE 3)        (NOTE 3)       (NOTE 3)
Total (NOTE 1)                                128/192        128/160         96/128         96/96

NOTE 1 - The overhead required for FEC is not shown in this table. 

NOTE 2 - With the reduced overhead framing modes, a 32 kbit/s ADSL system overhead is present in each buffer type. 
However, when all ASx and LSx are allocated to one buffer type, synchronization control, CRC, EOC, AOC 
and indicator bits may be carried in a single 32 kbit/s ADSL system overhead present in the buffer type used. 
With full overhead framing, a 32 kbit/s ADSL system overhead is always present in each buffer type. 

NOTE 3 - The shared synchronization capacity includes 32 kbit/s shared among LSx within the interleave buffer, 32 
kbit/s shared among LSx within the fast buffer, 32 kbit/s shared among ASx within the interleave buffer, and 
32 kbit/s shared among ASx within the fast buffer. The maximum rate occurs when at least one ASx is 
allocated to each type of buffer; the minimum rate occurs when all ASx and LSx are allocated to one buffer 
type. 



3.4. ATU-C Functional Characteristics. 

An ATU-C may support STM transmission or ATM transmission or both. The framing modes that shall be supported 
depend upon the ATU-C being configured for either STM or ATM transport. If framing mode k is supported, then modes k-1, ..., 
0 shall also be supported. 

During initialization, the ATU-C and ATU-R shall indicate a framing mode number 0, 1, 2 or 3, which they intend to 
use. The lowest indicated framing mode shall be used. 
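
The framing mode selection rule above amounts to taking the lower of the two indicated mode numbers. A minimal sketch, with a hypothetical function name:

```python
def negotiate_framing_mode(atu_c_mode, atu_r_mode):
    """Return the framing mode to use: the lowest mode indicated by either end."""
    for mode in (atu_c_mode, atu_r_mode):
        if mode not in (0, 1, 2, 3):
            raise ValueError("framing mode must be 0, 1, 2 or 3")
    return min(atu_c_mode, atu_r_mode)
```

Because support of mode k implies support of all lower modes, the lower indicated number is always a mode both ends can operate in.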

Using framing mode 0 ensures that an STM based ATU-x with an external ATM TC will interoperate with an ATM 
based ATU-x. Additional modes of interoperation are possible depending upon optional features provided in either ATU-x. 

An ATU-C may provide a Network Timing Reference (NTR). This operation shall be independent of any clocking 
that is internal to the ADSL system. 
3.4.1. STM Transmission Protocol Specific functionalities 

3.4.1.1. ATU-C input and output V interfaces for STM transport 

The functional data interfaces at the ATU-C for STM transport are shown in Figure 57. Input interfaces for the 
high-speed downstream simplex bearer channels are designated AS0 through AS3; input/output interfaces for the duplex bearer 
channels are designated LS0 through LS2. There shall also be a duplex interface for operations, administration, maintenance 
(OAM) and control of the ADSL system. 

3.4.1.2. Downstream simplex channels 

Four data input interfaces are defined at the ATU-C for the high-speed downstream simplex channels: AS0, AS1, 
AS2 and AS3 (ASx in general). 
3.4.1.3. Downstream/upstream duplex channels 

Three input and output data interfaces are defined at the ATU-C for the duplex channels supported by the ADSL 
system: LS0, LS1, and LS2 (LSx in general). LS0 is also known as the "C" or control channel. It carries the signaling 
associated with the ASx bearer channels and it may also carry some or all of the signaling associated with the other duplex 
bearer channels. 
3.4.1.4. Payload transfer delay 

The one-way transfer delay for payload bits in all bearers (simplex and duplex) from the V reference point at central 
office end (V-C) to the T reference point at remote end (T-R) for channels assigned to the fast buffer shall be no more than 2 
ms. For channels assigned to the interleave buffer it shall be no more than (4 + (S-1)/4 + S×D/4) ms. The same requirement 
applies in the opposite direction, from the T-R reference point to the V-C reference point. 
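
The delay bound stated above can be evaluated directly. In this sketch S and D are taken to be the number of DMT symbols per FEC codeword and the interleave depth, respectively; that reading is an assumption, as this passage does not define them, and the function name is hypothetical.

```python
def max_transfer_delay_ms(path, S=1, D=1):
    """Upper bound on one-way payload transfer delay, in milliseconds."""
    if path == "fast":
        return 2.0
    if path == "interleaved":
        return 4 + (S - 1) / 4 + S * D / 4
    raise ValueError("path must be 'fast' or 'interleaved'")
```

With S = 1 and D = 64, for example, the bound for the interleaved path is 4 + 0 + 16 = 20 ms.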
3.4.1.5. Framing Structure for STM transport 

An ATU-C configured for STM transport shall support the full overhead framing structure 0. The support of full 
overhead framing structure 1 and the reduced overhead framing structures 2 and 3 is optional. Preservation of V-C interface 
byte boundaries (if present) at the U-C interface may be supported for any of the U-C interface framing structures. An ATU-C 
configured for STM transport may support insertion of a Network Timing Reference (NTR). 
3.4.2. ATM Transmission Protocol Specific functionalities 

3.4.2.1. ATU-C input and output V interface for ATM transport 

The functional data interfaces at the ATU-C for ATM are shown in Figure 58. The ATM channel ATM0 shall always 
be provided; the channel ATM1 may be provided for support of dual latency mode. Each channel operates as an interface to a 
physical layer pipe. When operating in dual latency mode, no fixed allocation between the ATM channels 0 and 1 on one hand 
and transport of 'fast' and 'interleaved' data on the other hand is assumed. This relationship is configured inside the ATU-C. 

Flow control functionality shall be available on the V reference point to allow the ATU-C (i.e. the physical layer) to 
control the cell flow to and from the ATM layer. This functionality is represented by Tx_Cell_Handshake and 
Rx_Cell_Handshake. A cell may be transferred from the ATM to the PHY layer only after the ATU-C has activated the 
Tx_Cell_Handshake. Similarly, a cell may be transferred from the PHY layer to the ATM layer only after the 
Rx_Cell_Handshake. This functionality is important to avoid cell overflow or underflow in the ATU-C and ATM layers. 

There shall also be a duplex interface for operations, administration, maintenance (OAM) and control of the ADSL 
system. 

3.4.2.2 Payload transfer delay 

The one-way transfer delay (excluding cell specific functionalities) for payload bits in all bearers (simplex and 
duplex) from the V reference point at central office end (V-C) to the T reference point at remote end (T-R) for channels 
assigned to the fast buffer shall be no more than 2 ms. 

For channels assigned to the interleave buffer it shall be no more than (4 + (S-1)/4 + S×D/4) ms. The same 
requirement applies in the opposite direction, from the T-R reference point to the V-C reference point. 
3.4.2.3. ATM Cell specific functionalities 
3.4.2.3.1. Idle Cell Insertion 

Idle cells shall be inserted in the transmitter direction for cell rate de-coupling. Idle cells are identified by the 
standardized pattern for the cell header given in ITU-T Recommendation I.432. 
3.4.2.3.2. Header Error Control (HEC) Generation. 

The HEC byte shall be generated in the transmit direction as described in ITU-T Recommendation I.432, including 
the recommended modulo 2 addition (XOR) of the binary pattern 01010101 to the HEC bits. The generator polynomial coefficient set 
used and the HEC sequence generation procedure shall be in accordance with ITU-T Recommendation I.432. 
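
HEC generation can be illustrated with a short sketch. It assumes the generator polynomial x^8 + x^2 + x + 1 from ITU-T Recommendation I.432 (the polynomial itself is not restated in the text above) and computes a CRC-8 over the first four header bytes before adding the pattern 01010101. The function name is hypothetical.

```python
def atm_hec(header4):
    """Compute the HEC byte for the first four bytes of an ATM cell header."""
    if len(header4) != 4:
        raise ValueError("expected exactly four header bytes")
    crc = 0
    for byte in header4:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF  # x^8 + x^2 + x + 1 (assumed)
            else:
                crc = (crc << 1) & 0xFF
    return crc ^ 0x55                              # modulo-2 addition of 01010101
```

As a check, the header bytes 00 00 00 01 give an HEC of 0x52 under these assumptions.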

3.4.2.3.3. Cell payload scrambling. 

Scrambling of the cell payload field shall be used in the transmit direction to improve the security and robustness of 
the HEC cell delineation mechanism. In addition, it randomizes the data in the information field, for possible improvement of 
the transmission performance. The self-synchronizing scrambler polynomial $x^{43} + 1$ and procedures defined in ITU-T 
Recommendation I.432 shall be implemented. 
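
A self-synchronizing scrambler with polynomial x^43 + 1 adds to each input bit the scrambled bit sent 43 positions earlier; the descrambler adds the received bit from 43 positions earlier. The sketch below works on lists of bits and uses hypothetical function names. It omits the rule that only payload bits, not header bits, pass through the scrambler.

```python
from collections import deque

def scramble(bits, state=None):
    """Scramble with x^43 + 1: out[n] = in[n] XOR out[n-43]."""
    reg = deque(state or [0] * 43, maxlen=43)
    out = []
    for b in bits:
        s = b ^ reg[0]      # reg[0] is the scrambled bit from 43 bits ago
        out.append(s)
        reg.append(s)       # oldest entry drops off the left
    return out

def descramble(bits, state=None):
    """Descramble with x^43 + 1: out[n] = in[n] XOR in[n-43]."""
    reg = deque(state or [0] * 43, maxlen=43)
    out = []
    for s in bits:
        out.append(s ^ reg[0])
        reg.append(s)       # register holds received bits only
    return out
```

Because the descrambler register is filled only with received bits, a descrambler that starts from the wrong state recovers the data correctly after 43 bits.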

3.4.2.3.4. Bit timing and ordering 

When interfacing ATM data bytes to the AS0 or AS1 bearer channel, the most significant bit (MSB) shall be sent 
first. The AS0 or AS1 bearer channel data rates shall be integer multiples of 32 kbit/s, with bit timing synchronous with the 
ADSL downstream timing base. 

3.4.2.3.5 Cell Delineation. 

The cell delineation function permits the identification of cell boundaries in the payload. It uses the HEC field in the 
cell header. Cell delineation shall be performed using a coding law checking the HEC field in the cell header according to the 
algorithm described in ITU-T Recommendation I.432. The ATM cell delineation state machine is shown in Figure 59. 

In the HUNT state, the delineation process is performed by checking bit by bit for the correct HEC. Once such an 
agreement is found, it is assumed that one header has been found, and the method enters the PRESYNC state. When byte 
boundaries are available within the receiving Physical Layer prior to cell delineation, as with the framing modes 1, 2 and 3, the 
cell delineation process may be performed byte by byte. In the PRESYNC state, the delineation process is performed by 
checking cell by cell for the correct HEC. The process repeats until the correct HEC has been confirmed DELTA times 
consecutively, at which point the method enters the SYNC state. If an incorrect HEC is found, the process returns to the HUNT 
state. In the SYNC state the cell delineation will 
be assumed to be lost if an incorrect HEC is obtained ALPHA times consecutively. (With reference to ITU-T Recommendation 
I.432, no recommendation is made for the values of ALPHA and DELTA as the choice of these values is not considered to affect 
interoperability. However, it should be noted that the use of the values suggested in ITU-T Recommendation I.432 (ALPHA=1, 
DELTA=6) may be inappropriate due to the particular transmission characteristics of ADSL). 
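
The HUNT, PRESYNC and SYNC behavior described above can be sketched as a small state machine. Each call to `step` represents one HEC check (per bit or byte position in HUNT, per cell in PRESYNC and SYNC). The class and method names are hypothetical, ALPHA and DELTA are left as required parameters because no values are recommended here, and it is assumed that the HEC match found in HUNT does not itself count toward DELTA.

```python
class CellDelineator:
    """Sketch of an ATM cell delineation state machine."""

    def __init__(self, alpha, delta):
        self.alpha = alpha      # consecutive bad HECs that lose SYNC
        self.delta = delta      # consecutive good HECs that confirm SYNC
        self.state = "HUNT"
        self.count = 0

    def step(self, hec_ok):
        """Advance by one HEC check and return the new state."""
        if self.state == "HUNT":
            if hec_ok:
                self.state, self.count = "PRESYNC", 0
        elif self.state == "PRESYNC":
            if hec_ok:
                self.count += 1
                if self.count >= self.delta:
                    self.state, self.count = "SYNC", 0
            else:
                self.state, self.count = "HUNT", 0
        else:  # SYNC
            if hec_ok:
                self.count = 0          # errors must be consecutive
            else:
                self.count += 1
                if self.count >= self.alpha:
                    self.state, self.count = "HUNT", 0
        return self.state
```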

3.4.2.3.6 Header Error Control Verification 

The HEC covers the entire cell header. The code used for this function is capable of either single bit error correction 
or multiple bit error detection. Error detection shall be implemented as defined in ITU-T Recommendation I.432 with the 
exception that any HEC error shall be considered as a multiple bit error, and therefore, HEC error correction shall not be 
performed. 

3.4.2.4 Framing Structure for ATM transport 

An ATU-C configured for ATM transport shall support the full overhead framing structures 0 and 1. The support of 
reduced overhead framing structures 2 and 3 is optional. The ATU-C transmitter shall preserve V-C interface byte boundaries 
(explicitly present or implied by ATM cell boundaries) at the U-C interface, independent of the U-C interface framing 
structure. 

To ensure framing structure 0 interoperability between an ATM ATU-C and an ATM cell TC plus an STM ATU-R 
(i.e., ATM over STM), an STM ATU-R transporting ATM cells and not preserving T-R byte boundaries at the U-R interface shall indicate 
during initialization that frame structure 0 is the highest frame structure supported. An STM ATU-R transporting ATM cells 
and preserving T-R byte boundaries at the U-R interface shall indicate during initialization that frame structure 0, 1, 2 or 3 is 
the highest frame structure supported. An ATM ATU-C receiver operating in framing structure 0 cannot assume that the 
ATU-R transmitter will preserve T-R interface byte boundaries at the U-R interface and shall therefore perform the cell 
delineation bit-by-bit. 

An ATU-C configured for ATM transport may support insertion of a Network Timing Reference (NTR). 
3.4.3 Network timing reference (NTR) 

3.4.3.1 Need for NTR 

Some services require that a reference clock be available in the higher layers of the protocol stack (i.e. above the 
physical layer); this is used to guarantee end-to-end synchronization of transmit and receive sides. Examples are Voice and 
Telephony Over ATM (VTOA) and Desktop Video Conferencing (DVC). 

To support the distribution of a timing reference over the network, the ADSL system may transport an 8 kHz timing 
marker as NTR. This 8 kHz timing marker may be used for voice/video playback at the decoder (D/A converter) in DVC and 
VTOA applications. The 8 kHz timing marker is input to the ATU-C as part of the interface at the V-C reference point. 

3.4.3.2 Transport of the NTR 

The intention of the NTR transport mechanism is that the ATU-C provides timing information at the U-C reference 
point to enable the ATU-R to deliver to the T-R reference point timing information that has a timing accuracy corresponding to 
the accuracy of the clock provided to the V-C reference point. If provided, the NTR shall be inserted in the U-C framing 
structure as follows: 

a) The ATU-C may generate an 8 kHz local timing reference (LTR) by dividing its sampling clock by the appropriate 
integer (276 if 2.208 MHz is used); 

b) It shall transmit the change in phase offset between the input NTR and LTR (measured in cycles of the 2.208 MHz 
clock, that is, units of approximately 452 ns) from the previous superframe to the present one. This shall be encoded 
into four bits ntr3 - ntr0 (with ntr3 the MSB), representing a signed integer in the -8 to +7 range in 2's-complement 
notation. The bits ntr3-ntr0 shall be carried in the indicator bits 23 (ntr3) to 20 (ntr0); see Table 9. 

c) A positive value of the change of phase offset, $\Delta_2\phi$, shall indicate that the LTR is higher in frequency than the 
NTR. 

d) Alternatively, the ATU-C may choose to lock its downstream sampling clock (2.208 MHz) to 276 times the NTR 
frequency; in that case it shall encode $\Delta_2\phi$ to zero. 

The NTR, as specified by ANSI Standard T1.101, has a maximum frequency variation of ±32 ppm. The LTR has a 
maximum frequency variation of ±50 ppm. The maximum mismatch is therefore ±82 ppm. This would result in an average 
change of phase offset of approximately ±3.5 clock cycles over one 17 ms superframe, which can be mapped into 4 overhead 
bits. 
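
The four-bit encoding of the change of phase offset can be sketched as follows. The function names are hypothetical; the bit list is ordered ntr3 (MSB) to ntr0, and values outside the -8 to +7 range are rejected rather than clipped, which is a choice made for this sketch only.

```python
SAMPLING_CLOCK_HZ = 2_208_000
LTR_DIVIDER = 276  # 2.208 MHz / 276 = 8 kHz local timing reference

def encode_ntr(delta_cycles):
    """Encode a phase-offset change (in clock cycles) as bits [ntr3, ntr2, ntr1, ntr0]."""
    if not -8 <= delta_cycles <= 7:
        raise ValueError("change of phase offset must lie in -8..+7")
    value = delta_cycles & 0xF              # 4-bit two's complement
    return [(value >> k) & 1 for k in (3, 2, 1, 0)]

def decode_ntr(bits):
    """Recover the signed phase-offset change from [ntr3, ntr2, ntr1, ntr0]."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value - 16 if value & 0x8 else value
```

For example, a change of -3 cycles is carried as ntr3..ntr0 = 1101.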

One method that the ATU-C may use to measure this change of phase offset is shown in Figure 60. 
3.4.4 Framing 

This subclause specifies framing of the downstream signal (ATU-C transmitter). Two types of framing are defined: 
full overhead and reduced overhead. Furthermore, two versions of full overhead and two versions of reduced overhead are 
defined. The four resulting framing modes are defined in Table 8, and shall be referred to as framing modes 0, 1, 2 and 3. 
Table 8. Definition of framing modes 

Framing structure   Description
-----------------   --------------------------------------------------------------------------------------
0                   Full overhead framing with asynchronous bit-to-modem timing 
                    (i.e. enabled synchronization control mechanism) 
1                   Full overhead framing with synchronous bit-to-modem timing 
                    (i.e. disabled synchronization control mechanism) 
2                   Reduced overhead framing with separate fast and sync byte in fast and interleaved latency 
                    buffer respectively (i.e. 64 kbit/s framing overhead) 
3                   Reduced overhead framing with merged fast and sync byte, using either the fast or the 
                    interleaved latency buffer (i.e. 32 kbit/s framing overhead) 



Requirements for framing modes to be supported depend upon the ATU-C being configured for either STM or ATM transport. 
The ATU-C shall indicate during initialization the highest framing structure number it supports. If the ATU-C indicates it 
supports framing structure k, it shall also support all framing structures k-1 to 0. If the ATU-R indicates a lower framing 
structure number during initialization, the ATU-C shall fall back to the framing structure number indicated by the ATU-R. 
Outside the ASx/LSx serial interfaces data bytes are transmitted MSB first in accordance with ITU-T Recommendations G.703, 
G.709, I.361, and I.432. All serial processing in the ADSL frame (e.g., CRC, scrambling, etc.) shall, however, be performed 
LSB first, with the outside world MSB considered by the ADSL as LSB. As a result, the first incoming bit (outside world 
MSB) will be the first processed bit inside the ADSL (ADSL LSB). 
3.4.4.1 Data symbols 

Figures 53 and 54 show functional block diagrams of the ATU-C transmitter with reference points for data framing. 
Up to four downstream simplex data channels and up to three duplex data channels shall be synchronized to the 4 kHz ADSL 
DMT frame rate, and multiplexed into two separate data buffers (fast and interleaved). A cyclic redundancy check (CRC), 
scrambling, and forward error correction (FEC) coding shall be applied to the contents of each buffer separately, and the data 
from the interleaved buffer shall then be passed through an interleaving function. The two data streams shall then be tone 
ordered, and combined into a data symbol that is input to the constellation encoder. After constellation encoding the data shall 
be modulated to produce an analog signal for transmission across the customer loop. 

A bit-level framing pattern shall not be inserted into the data symbols of the frame or superframe structure. DMT
frame (i.e. symbol) boundaries are delineated by the cyclic prefix inserted by the modulator. Superframe boundaries are
determined by the synchronization symbol, which shall also be inserted by the modulator and which carries no user data.

Because of the addition of FEC redundancy bytes and data interleaving, the data frames (i.e. bit-level data prior to
constellation encoding) have a different structural appearance at the three reference points through the transmitter. As shown in
Figures 53 and 54, the reference points for which data framing will be described in the following subclauses are:

a) A (Mux data frame): the multiplexed, synchronized data after the CRC has been inserted 

b) B (FEC output data frame): the data frame generated at the output of the FEC encoder at the DMT symbol rate, 
where an FEC block may span more than one DMT symbol period 

c) C (constellation encoder input data frame): the data frame presented to the constellation coder. 
3.4.4.1.1 Superframe structure

ADSL uses the superframe structure shown in Figure 61. Each superframe is composed of 68 data frames, numbered 
from 0 to 67, which are encoded and modulated into DMT symbols, followed by a synchronization symbol, which carries no 
user or overhead bit-level data and is inserted by the modulator to establish superframe boundaries. From the bit-level and 
user data perspective, the DMT symbol rate is 4000 baud (period = 250 us), but in order to allow for the insertion of the 
synchronization symbol the transmitted DMT symbol rate is 69/68*4000 baud. Each data frame within the superframe 
contains data from the fast buffer and the interleaved buffer. During each ADSL superframe, eight bits shall be reserved for the 
CRC on the fast data buffer (crc0-crc7), and 24 indicator bits (ib0-ib23) shall be assigned for OAM functions. As shown in 
Figure 62, the synchronization byte of the fast data buffer ("fast byte") carries the CRC check bits in frame 0 and the indicator
bits in frames 1, 34, and 35. The fast byte in other frames is assigned in even-/odd-frame pairs to either the EOC or to 
synchronization control of the bearer channels assigned to the fast buffer. 

Bit 0 of the fast byte in an even-numbered frame (other than frames 0 and 34) and bit 0 of the fast byte of the
odd-numbered frame immediately following shall be set to "0" to indicate these frames carry synchronization control information.
When they are not required for synchronization control, CRC, or indicator bits, the fast bytes of two successive ADSL frames,
beginning with an even-numbered frame, may contain indications of "no synchronization action" or alternatively, they may be
used to transmit one EOC message, consisting of 13 bits. The indicator bits are defined in Table 9. Bit 0 of the fast byte in an
even-numbered frame (other than frames 0 and 34) and bit 0 of the fast byte of the odd-numbered frame immediately following
shall be set to "1" to indicate these frames carry a 13-bit EOC message plus one additional bit, r1. The r1 bit is reserved for
future use and shall be set to 1.
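
The assignment of the fast byte across the 68 data frames of a superframe, and the transmitted symbol rate that results from inserting the synchronization symbol, can be sketched as follows. This is an illustrative sketch only; the function names and the returned label strings are assumptions introduced here and are not part of the specification.

```python
from fractions import Fraction

DATA_FRAMES_PER_SUPERFRAME = 68   # data frames 0..67
DATA_FRAME_RATE = 4000            # data frames per second (250 us period)


def fast_byte_role(frame: int) -> str:
    """Return what the fast byte carries in a given data frame (full overhead)."""
    if not 0 <= frame < DATA_FRAMES_PER_SUPERFRAME:
        raise ValueError("frame must be in 0..67")
    if frame == 0:
        return "crc0-crc7"
    if frame == 1:
        return "ib0-ib7"
    if frame == 34:
        return "ib8-ib15"
    if frame == 35:
        return "ib16-ib23"
    # Remaining frames are used in even/odd pairs for EOC or sync control.
    return "EOC or synchronization control"


def transmitted_symbol_rate() -> Fraction:
    """69 symbols (68 data + 1 sync) are sent in the time of 68 data frames."""
    return Fraction(69, 68) * DATA_FRAME_RATE


if __name__ == "__main__":
    print(float(transmitted_symbol_rate()))
    print([f for f in range(68) if fast_byte_role(f).startswith("ib")])
```

Under these assumptions, 64 of the 68 fast bytes remain available for EOC or synchronization control in each superframe.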

Table 9 - Definition of indicator bits, ATU-C transmitter (fast data buffer, downstream direction)

Indicator bit   Definition (see NOTE)
ib0 - ib7       reserved for future use
ib8             FEBE-I
ib9             FECC-I
ib10            FEBE-F
ib11            FECC-F
ib12            LOS
ib13            RDI
ib14            NCD-I (used for ATM only, shall be set to 1 for STM)
ib15            NCD-F (used for ATM only, shall be set to 1 for STM)
ib16            HEC-I (used for ATM only, shall be set to 1 for STM)
ib17            HEC-F (used for ATM only, shall be set to 1 for STM)
ib18 - ib19     reserved for future use
ib20 - ib23     NTR0 - 3 (if NTR is not transported, ib20-23 shall be set to 1 -
                they are active low)

NOTE - Because all indicator bits are defined as active low, reserved bits shall be set to 1.



Eight bits per ADSL superframe shall be used for the CRC on the interleaved data buffer (crc0 - crc7). As shown in
Figure 63 and Figure 65, the synchronization byte of the interleaved data buffer ("sync byte") carries the CRC check bits for
the previous superframe in frame 0. In all other frames (1 through 67), the sync byte shall be used for synchronization control
of the bearer channels assigned to the interleaved data buffer or used to carry an ADSL overhead control (AOC) channel. In
the full overhead mode, when any bearer channel appears in the interleave buffer, then the AOC data shall be carried in the
LEX byte, and the sync byte shall designate when the LEX byte contains AOC data and when it contains data bytes from the
bearer channel. When no bearer channels are allocated in the interleaved data buffer (i.e., all B_I(ASx) = B_I(LSx) = 0), the sync
byte shall carry the AOC data directly.
3.4.4.1.2 Frame structure (with full overhead)

Each data frame shall be encoded into a DMT symbol. As is shown in Figure 61, each frame is composed of a fast 
data buffer and an interleaved data buffer, and the frame structure has a different appearance at each of the reference points (A, 
B, and C). The bytes of the fast data buffer shall be clocked into the constellation encoder first, followed by the bytes of the
interleaved data buffer. Bytes are clocked least significant bit first.

Each bearer channel shall be assigned to either the fast or the interleaved buffer during initialization, and a pair of
bytes, [B_F, B_I], transmitted for each bearer channel, where B_F and B_I designate the number of bytes allocated to the fast and
interleaved buffers, respectively.

The seven [B_F, B_I] pairs to specify the downstream bearer channel rates are: B_F(ASx), B_I(ASx) for X = 0, 1, 2 and 3,
for the downstream simplex channels; B_F(LSx), B_I(LSx) for X = 0, 1 and 2, for the (downstream transport of the) duplex
channels.

The rules for allocation are as follows:

• For any bearer channel, X, (except the 16 kbit/s C channel option) either B_F(X) = the number of bytes per frame
of the fast buffer and B_I(X) = 0, or B_F(X) = 0 and B_I(X) = the number of bytes per frame of the interleaved
buffer.

• For the 16 kbit/s C channel option, B_F(LS0) = 255 (11111111_2) and B_I(LS0) = 0, or B_F(LS0) = 0 and B_I(LS0) =
255.

3.4.4.1.2.1 Fast data buffer (with full overhead)

The frame structure of the fast data buffer shall be as shown in Figure 64, for reference points A and B, which are
defined in Figure 53 and 54.

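The following shall hold for the parameters shown in Figure 64:

$$C_F(LS0) = \begin{cases} 0 & \text{if } B_F(LS0) = 255\ (11111111_2) \\ B_F(LS0) & \text{otherwise} \end{cases}$$

$$N_F = K_F + R_F$$

where R_F = number of FEC redundancy bytes, and

$$K_F = 1 + \sum_{i=0}^{3} B_F(ASi) + A_F + C_F(LS0) + \sum_{j=1}^{2} B_F(LSj) + L_F$$

where

$$A_F = \begin{cases} 0 & \text{if } \sum_{i=0}^{3} B_F(ASi) = 0 \\ 1 & \text{otherwise} \end{cases}$$

and

$$L_F = \begin{cases} 0 & \text{if } B_F(ASi) = 0 \text{ for } i = 0\text{-}3 \text{ and } B_F(LSj) = 0 \text{ for } j = 0\text{-}2 \\ 1 & \text{otherwise (including } B_F(LS0) = 255) \end{cases}$$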

At reference point A (Mux data frame) in Figure 53 and Figure 54 the fast buffer shall always contain at least the fast
byte. This is followed by B_F(AS0) bytes of channel AS0, then B_F(AS1) bytes of channel AS1, B_F(AS2) bytes of channel AS2
and B_F(AS3) bytes of channel AS3. Next come the bytes for any duplex (LSx) channels allocated to the fast buffer. If any
B_F(ASx) is non-zero, then both an AEX and a LEX byte follow the bytes of the last LSx channel, and if any B_F(LSx) is non-zero,
the LEX byte shall be included.
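
As a worked illustration of the parameter relations for Figure 64, the mux data frame length K_F may be computed from the per-channel byte allocations as sketched below. The function name and the list-based argument layout are assumptions made for this example; the same relations apply to the interleaved buffer with B_I in place of B_F.

```python
def mux_frame_length(b_as, b_ls):
    """Bytes per mux data frame (K) for one buffer in full overhead mode.

    b_as: byte allocations for AS0..AS3 in this buffer (4 values)
    b_ls: byte allocations for LS0..LS2 in this buffer (3 values);
          b_ls[0] == 255 signals the 16 kbit/s C channel option.
    """
    if len(b_as) != 4 or len(b_ls) != 3:
        raise ValueError("expected 4 ASx values and 3 LSx values")
    # C(LS0): no dedicated LS0 bytes when the 16 kbit/s option is selected.
    c_ls0 = 0 if b_ls[0] == 255 else b_ls[0]
    # AEX byte present only if some ASx channel uses this buffer.
    a = 1 if sum(b_as) != 0 else 0
    # LEX byte present if any ASx or LSx channel uses this buffer
    # (b_ls[0] == 255 is non-zero, so it is included automatically).
    l = 1 if (sum(b_as) != 0 or sum(b_ls) != 0) else 0
    # Leading 1 is the fast (or sync) byte.
    return 1 + sum(b_as) + a + c_ls0 + b_ls[1] + b_ls[2] + l


if __name__ == "__main__":
    # One simplex channel of 96 bytes: fast byte + 96 + AEX + LEX.
    print(mux_frame_length([96, 0, 0, 0], [0, 0, 0]))
```

For an empty buffer the result is 1, reflecting that the fast byte is always present in full overhead mode.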
When B_F(LS0) = 255, no bytes are included for the LS0 channel. Instead, the 16 kbit/s C channel shall be
transported in every other LEX byte on average, using the sync byte to denote when to add the LEX byte to the LS0 bearer
channel.

R_F FEC redundancy bytes shall be added to the mux data frame (reference point A) to produce the FEC output data
frame (reference point B), where R_F is given in the options used during initialization.

Because the data from the fast buffer is not interleaved, the constellation encoder input data frame (reference point C)
is identical to the FEC output data frame (reference point B).
3.4.4.1.2.2. Interleaved data buffer (with full overhead) 

The frame structure of the interleaved data buffer is shown in Figure 65 for reference points A and B, which are
defined in Figure 53 and 54.

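The following shall hold for the parameters shown in Figure 65:

$$C_I(LS0) = \begin{cases} 0 & \text{if } B_I(LS0) = 255\ (11111111_2) \\ B_I(LS0) & \text{otherwise} \end{cases}$$

$$N_I = (S \times K_I + R_I)/S$$

where R_I = number of FEC redundancy bytes and S = number of DMT symbols per FEC codeword.

$$K_I = 1 + \sum_{i=0}^{3} B_I(ASi) + A_I + C_I(LS0) + \sum_{j=1}^{2} B_I(LSj) + L_I$$

where

$$A_I = \begin{cases} 0 & \text{if } \sum_{i=0}^{3} B_I(ASi) = 0 \\ 1 & \text{otherwise} \end{cases}$$

and

$$L_I = \begin{cases} 0 & \text{if } B_I(ASi) = 0 \text{ for } i = 0\text{-}3 \text{ and } B_I(LSj) = 0 \text{ for } j = 0\text{-}2 \\ 1 & \text{otherwise (including } B_I(LS0) = 255) \end{cases}$$
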
At reference point A, the Mux data frame, the interleaved data buffer shall always contain at least the sync byte. The
rest of the buffer shall be built in the same manner as the fast buffer, substituting B_I in place of B_F. The length of each mux
data frame is K_I bytes, as defined in Figure 65.

The FEC coder shall take in S mux data frames and append R_I FEC redundancy bytes to produce the FEC codeword
of length N_FEC = S × K_I + R_I bytes. The FEC output data frames shall contain N_I = N_FEC / S bytes, where N_I is an integer.
When S > 1, then for the S frames in an FEC codeword, the FEC output data frame (reference point B) shall partially overlap
two mux data frames for all except the last frame, which shall contain the R_I FEC redundancy bytes.

The FEC output data frames are interleaved to a specified interleave depth. The interleaving process delays each
byte of a given FEC output data frame a different amount, so that the constellation encoder input data frames will contain bytes
from many different FEC data frames. At reference point A in the transmitter, mux data frame 0 of the interleaved data buffer
is aligned with the ADSL superframe and mux data frame 0 of the fast data buffer (this is not true at reference point C). At the
receiver, the interleaved data buffer will be delayed by (S × interleave depth × 250) µs with respect to the fast data buffer, and
data frame 0 (containing the CRC bits for the interleaved data buffer) will appear a fixed number of frames after the beginning
of the receiver superframe.
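
The length relations for the interleaved path and the resulting delay relative to the fast path can be illustrated with the short sketch below. The function names are assumptions made for this example, and the delay is computed on the reading that the factor 250 is the 250 µs data frame period.

```python
def interleaved_fec_frame_length(k_i: int, r_i: int, s: int) -> int:
    """Bytes per FEC output data frame (N_I) on the interleaved path."""
    n_fec = s * k_i + r_i          # FEC codeword length in bytes
    if n_fec % s != 0:
        # N_I must be an integer, i.e. R_I must be a multiple of S.
        raise ValueError("S*K_I + R_I is not divisible by S")
    return n_fec // s


def interleave_delay_us(s: int, depth: int) -> int:
    """Delay of the interleaved buffer relative to the fast buffer, in us."""
    return s * depth * 250


if __name__ == "__main__":
    print(interleaved_fec_frame_length(k_i=50, r_i=16, s=4))
    print(interleave_delay_us(s=4, depth=16))
```

With S = 4 and an interleave depth of 16, this reading gives a relative delay of 16 000 µs, i.e. 64 data frame periods.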






WO 00/07323 PCT/US99/17369



3.4.4.1.3 Cyclic redundancy check (CRC)

Two cyclic redundancy checks (CRCs), one for the fast data buffer and one for the interleaved data buffer, shall be
generated for each superframe and transmitted in the first frame of the following superframe. Eight bits per buffer type (fast or
interleaved) per superframe are allocated to the CRC check bits. These bits are computed from the k message bits using the
equation:

$$\mathrm{crc}(D) = M(D)\,D^{8} \bmod G(D),$$

where

$$M(D) = m_0 D^{k-1} + m_1 D^{k-2} + \dots + m_{k-2} D + m_{k-1}$$

is the message polynomial,

$$G(D) = D^{8} + D^{4} + D^{3} + D^{2} + 1$$

is the generating polynomial,

$$\mathrm{crc}(D) = c_0 D^{7} + c_1 D^{6} + \dots + c_6 D + c_7$$

is the check polynomial, and D is the delay operator. That is, CRC is the remainder when M(D) D^8 is divided by G(D). The
CRC check bits are transported in the synchronization bytes (fast and interleaved; 8 bits each) of frame 0 for each data buffer.
The bits (i.e. message polynomials) covered by the CRC include:

Fast data buffer: 

■ frame 0: ASx bytes (X = 0, 1, 2, 3), LSx bytes (X = 0, 1, 2),
followed by any AEX and LEX bytes.

■ all other frames: fast byte, followed by ASx bytes (X = 0, 1, 2, 3),
LSx bytes (X = 0, 1, 2), and any AEX and LEX bytes.

Interleaved data buffer:

■ frame 0: ASx bytes (X = 0, 1, 2, 3), LSx bytes (X = 0, 1, 2),
followed by any AEX and LEX bytes.

■ all other frames: sync byte, followed by ASx bytes (X = 0, 1, 2, 3),
LSx bytes (X = 0, 1, 2), and any AEX and LEX bytes.

Each byte shall be clocked into the CRC least significant bit first. 

The number of bits over which the CRC is computed varies with the allocation of bytes to the fast and interleaved
data buffers (the numbers of bytes in ASx and LSx vary according to the [B_F, B_I] pairs; AEX is present in a given buffer only if
at least one ASx is allocated to that buffer; LEX is present in a given buffer only if at least one ASx or one LSx is allocated to
that buffer).

Because of the flexibility in assignment of bearer channels to the fast and interleaved data buffers, CRC field lengths 
over an ADSL superframe will vary from approximately 67 bytes to approximately 14,875 bytes. 
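
The CRC computation described in this subclause can be illustrated by a bit-serial shift register, sketched below under stated assumptions: the function name is introduced here for illustration, the register is assumed to start at zero, and the returned integer holds c_0 in its most significant bit and c_7 in its least significant bit. How these bits are placed into crc0-crc7 of the overhead byte is not addressed by this sketch.

```python
GENERATOR_LOW_BITS = 0x1D  # D^4 + D^3 + D^2 + 1; the D^8 term is implicit


def adsl_crc8(data: bytes) -> int:
    """Remainder of M(D) * D^8 divided by G(D) = D^8 + D^4 + D^3 + D^2 + 1.

    Each byte is clocked in least significant bit first, so the LSB of the
    first byte is m_0 (the highest-order message coefficient).
    """
    reg = 0
    for byte in data:
        for bit_index in range(8):              # LSB first
            bit = (byte >> bit_index) & 1
            feedback = ((reg >> 7) & 1) ^ bit   # top of register XOR input
            reg = (reg << 1) & 0xFF
            if feedback:
                reg ^= GENERATOR_LOW_BITS
    return reg


if __name__ == "__main__":
    print(f"{adsl_crc8(bytes([0x80])):#04x}")
    print(f"{adsl_crc8(b'example superframe payload'):#04x}")
```

A single byte 0x80 clocked LSB first corresponds to M(D) = 1, so the remainder is D^8 mod G(D) = D^4 + D^3 + D^2 + 1, i.e. 0x1D.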
3.4.4.2 Synchronization 

If the bit timing base of the input user data streams is not synchronous with the ADSL modem timing base, the input
data streams shall be synchronized to the ADSL timing base using the synchronization control mechanism (consisting of
synchronization control byte and the AEX and LEX bytes). Forward-error-correction coding shall always be applied to the
synchronization control byte(s).

If the bit timing base of the input user data streams is synchronous with the ADSL modem timing base, then the
synchronization control mechanism is not needed, and the synchronization control byte shall always indicate "no
synchronization action" (see Table 10 and Table 11).
3.4.4.2.1. Synchronization for the fast data buffer

Synchronization control for the fast data buffer may occur in frames 2 through 33, and 36 through 67 of an ADSL
superframe, where the fast byte may be used as the synchronization control byte. No synchronization action shall be taken for
those frames for which the fast byte is used for CRC, fixed indicator bits, or EOC.

The format of the fast byte when used as synchronization control for the fast data buffer shall be as given in Table 10. 

Table 10 - Fast byte format for synchronization

Bits       Designation                       Codes
sc7, sc6   ASx bearer channel designator     "00_2" : AS0
                                             "01_2" : AS1
                                             "10_2" : AS2
                                             "11_2" : AS3
sc5, sc4   Synchronization control for the   "00_2" : no synchronization action
           designated ASx bearer channel     "01_2" : add AEX byte to designated ASx bearer channel
           (see note 3)                      "11_2" : add AEX and LEX bytes to ASx bearer channel
                                             "10_2" : delete last byte from designated ASx bearer channel
sc3, sc2   LSx bearer channel designator     "00_2" : LS0
           (see note 3)                      "01_2" : LS1
                                             "10_2" : LS2
                                             "11_2" : no synchronization action
sc1        Synchronization control for the   "1_2" : add LEX byte to designated LSx bearer channel
           designated LSx bearer channel     "0_2" : delete last byte from designated LSx bearer channel
sc0        Synchronization/EOC designator    "0_2" : perform synchronization control as indicated in sc7-sc1
                                             "1_2" : this byte is part of an EOC frame



ADSL deployments may need to inter-work with DS1 (1.544 Mbit/s) or DS1C (3.152 Mbit/s) rates. The
synchronization control option that allows adding up to two bytes to an ASx bearer channel provides sufficient overhead
capacity to transport combinations of DS1 or DS1C channels transparently (without interpreting or stripping and regenerating
the framing embedded within the DS1 or DS1C). The synchronization control algorithm shall, however, guarantee that the fast
byte in some minimum number of frames is available to carry EOC frames, so that a minimum EOC rate (4 kbit/s) may be
maintained.

When the data rate of the C channel is 16 kbit/s, the LS0 bearer channel is transported in the LEX byte, using the
"add LEX byte to designated LSx channel" code, with LS0 as the designated channel, every other frame on average.

If the bit timing base of the input bearer channels (ASx, LSx) is synchronous with the ADSL modem timing base,
then ADSL systems need not perform synchronization control (by adding or deleting AEX or LEX bytes to/from the designated
ASx and LSx channels). In this case, the synchronization control byte shall indicate "no synchronization action" (i.e., sc7-0
coded "XX0011X0_2", with X discretionary).
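
The field layout of the fast byte when used for synchronization control (Table 10) can be illustrated by the decoding sketch below. The function name, the dictionary keys, and the convention that sc7 is the most significant bit of the integer passed in are assumptions made for this example only.

```python
AS_ACTIONS = {
    0b00: "no synchronization action",
    0b01: "add AEX byte",
    0b11: "add AEX and LEX bytes",
    0b10: "delete last byte",
}


def decode_fast_sync_byte(sc: int) -> dict:
    """Decode sc7..sc0 of a fast byte used for synchronization control."""
    if not 0 <= sc <= 0xFF:
        raise ValueError("sc must fit in one byte")
    if sc & 0b1:                       # sc0 = 1: byte belongs to an EOC frame
        return {"eoc": True}
    ls_code = (sc >> 2) & 0b11         # sc3, sc2
    if ls_code == 0b11:
        ls_channel, ls_action = None, "no synchronization action"
    else:
        ls_channel = f"LS{ls_code}"
        # sc1 selects between adding a LEX byte and deleting the last byte.
        ls_action = "add LEX byte" if (sc >> 1) & 1 else "delete last byte"
    return {
        "eoc": False,
        "as_channel": f"AS{(sc >> 6) & 0b11}",        # sc7, sc6
        "as_action": AS_ACTIONS[(sc >> 4) & 0b11],    # sc5, sc4
        "ls_channel": ls_channel,
        "ls_action": ls_action,
    }


if __name__ == "__main__":
    print(decode_fast_sync_byte(0b00001100))   # "XX0011X0" with X = 0
    print(decode_fast_sync_byte(0b01010010))
```

Any byte matching the pattern XX0011X0 decodes to "no synchronization action" for both the ASx and the LSx fields, which is the coding required when the bearer channels are synchronous with the modem timing base.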
3.4.4.2.2. Synchronization for the interleaved data buffer

Synchronization control for the interleaved data buffer can occur in frames 1 through 67 of an ADSL superframe,
where the sync byte may be used as the synchronization control byte. No synchronization action shall be taken during frame 0,
where the sync byte is used for CRC, and during frames when the LEX byte carries the AOC.

The format of the sync byte when used as synchronization control for the interleaved data buffer shall be as given in
Table 11. In the case where no signals are allocated to the interleaved data buffer, the sync byte shall carry the AOC data
directly, as shown in Figure 63.

Table 11 - Sync byte format for synchronization

Bits       Designation                       Codes
sc7, sc6   ASx bearer channel designator     "00_2" : AS0
                                             "01_2" : AS1
                                             "10_2" : AS2
                                             "11_2" : AS3
sc5, sc4   Synchronization control for the   "00_2" : no synchronization action
           designated ASx bearer channel     "01_2" : add AEX byte to designated ASx bearer channel
           (see note 3)                      "11_2" : add AEX and LEX bytes to ASx bearer channel
                                             "10_2" : delete last byte from designated ASx bearer channel
sc3, sc2   LSx bearer channel designator     "00_2" : LS0
                                             "01_2" : LS1
                                             "10_2" : LS2
                                             "11_2" : no synchronization action
sc1        Synchronization control for the   "1_2" : add LEX byte to designated LSx bearer channel
           designated LSx bearer channel     "0_2" : delete last byte from designated LSx bearer channel
sc0        Synchronization/AOC designator    "0_2" : perform synchronization control as indicated in sc7-sc1
                                             "1_2" : LEX byte carries ADSL overhead control channel data;
                                             synchronization control is allowed for "add AEX" or "delete" as
                                             indicated in sc7-sc1






ADSL deployments may need to inter-work with DS1 (1.544 Mbit/s) or DS1C (3.152 Mbit/s) rates. The
synchronization control option that allows adding up to two bytes to an ASx bearer channel provides sufficient overhead
capacity to transport combinations of DS1 or DS1C channels transparently (without interpreting or stripping and regenerating
the framing embedded within the DS1 or DS1C).

When the data rate of the C channel is 16 kbit/s, the LS0 bearer channel is transported in the LEX byte, using the
"add LEX byte to designated LSx channel" code, with LS0 as the designated channel, every other frame on average.

If the bit timing base of the input bearer channels (ASx, LSx) is synchronous with the ADSL modem timing base then
ADSL systems need not perform synchronization control by adding or deleting AEX or LEX bytes to/from the designated ASx
and LSx channels. In this case, the synchronization control byte shall indicate "no synchronization action". In this case, when
framing mode 1 is used, the sc7-0 shall always be coded "XX0011XX_2", with X discretionary. When sc0 is set to 1, the LEX
byte shall carry AOC. When sc0 is set to 0, the LEX byte shall be coded 00_16. The sc0 may be set to 0 only in between
transmissions of 5 concatenated and identical AOC messages.
3.4.4.3 Reduced overhead framing 

The format described for full overhead framing includes overhead to allow for the synchronization of the seven ASx 
and LSx bearer channels. When the synchronization function is not required, the ADSL equipment may operate in a reduced 
overhead mode. This mode retains all the full overhead mode functions except synchronization control. 
3.4.4.3.1 Reduced overhead framing with separate fast and sync bytes

The AEX and LEX bytes shall be eliminated from the ADSL frame format, and both the fast and sync bytes shall
carry overhead information. The fast byte carries the fast buffer CRC, indicator bits, and EOC messages, and the sync byte
carries the interleaved buffer CRC and AOC message. The assignment of overhead functions to fast and sync bytes when using
the full overhead framing and when using the reduced overhead framing with separate fast and sync bytes shall be as shown in
Table 12.

In the reduced overhead framing with separate fast and sync bytes, the structure of the fast data buffer shall be as
shown in Figure 64 with A_F and L_F set to 0. The structure of the interleaved data buffer shall be as shown in Figure 65 with A_I
and L_I set to 0.



Table 12 - Overhead functions for framing modes

                   Full Overhead Mode                 Reduced Overhead Mode
Frame Number       Fast Sync       Interleave Sync    Fast Sync                  Interleave Sync
0                  fast CRC        interleaved CRC    fast CRC                   interleaved CRC
1                  IB0-7           sync or AOC        IB0-7                      AOC
34                 IB8-15          sync or AOC        IB8-15                     AOC
35                 IB16-23         sync or AOC        IB16-23                    AOC
all other frames   sync or EOC     sync or AOC        sync or EOC (see NOTE)     AOC

NOTE - In the reduced overhead mode only the "no synchronization action" code shall be used.
3.4.4.3.2 Reduced overhead framing with merged fast and sync bytes

In the single latency mode, data is assigned to only one data buffer (fast or interleaved). If data is assigned to only the 
fast buffer, then only the fast byte shall be used to carry overhead information. If data is assigned only to the interleaved 
buffer, then only the sync byte shall be used to carry overhead information. Reduced overhead framing with merged fast and 
sync bytes shall not be used when operating in dual latency mode. 

For ADSL systems transporting data using a single data buffer (fast or interleaved), the CRC, indicator, EOC and 
AOC function shall be carried in a single overhead byte assigned to separate data frames within the superframe structure. The 
CRC remains in frame 0 and the indicator bits in frames 1, 34, and 35. The AOC and EOC bytes are assigned to alternate 
pairs of frames. For ADSL equipment operating in single latency mode using the reduced overhead framing with merged fast 
and sync bytes, the assignment of overhead functions shall be as shown in Table 13. 

In the single latency mode using the reduced overhead framing with merged fast and sync bytes, only one data buffer
shall be used. If the fast data buffer is used, the structure of the fast data buffer shall be as shown in Figure 64 (with A_F and L_F
set to 0) and the interleaved data buffer shall be empty (no sync byte and K_I = 0). If the interleaved data buffer is used, the
structure of the interleaved data buffer shall be as shown in Figure 65 (with A_I and L_I set to 0) and the fast data buffer shall be
empty (no fast byte and K_F = 0).

Table 13 - Overhead functions for reduced overhead mode with merged fast and sync bytes

Frame Number               (Fast Buffer Only)           (Interleaved Buffer Only)
                           Fast Byte format             Sync Byte format
0                          Fast CRC                     Interleaved CRC
1                          IB0-7                        IB0-7
34                         IB8-15                       IB8-15
35                         IB16-23                      IB16-23
4n+2, 4n+3                 EOC or sync (see NOTE)       EOC or sync (see NOTE)
with n = 0...16, n ≠ 8
4n, 4n+1                   AOC                          AOC
with n = 0...16, n ≠ 0

NOTE - In the reduced overhead mode only the "no synchronization action" code shall be used.
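
The frame-number rules of Table 13 partition the 68 data frames of a superframe among the CRC, the indicator bits, the EOC and the AOC. A minimal sketch of that mapping follows; the function name and the returned label strings are assumptions made for illustration.

```python
def merged_overhead_role(frame: int) -> str:
    """Role of the single overhead byte in merged fast/sync reduced overhead mode."""
    if not 0 <= frame <= 67:
        raise ValueError("frame must be in 0..67")
    fixed = {0: "CRC", 1: "IB0-7", 34: "IB8-15", 35: "IB16-23"}
    if frame in fixed:
        return fixed[frame]
    # Frames 4n+2 and 4n+3 (n != 8, which would be frames 34 and 35).
    if frame % 4 in (2, 3):
        return "EOC or sync"
    # Frames 4n and 4n+1 (n != 0, which would be frames 0 and 1).
    return "AOC"


if __name__ == "__main__":
    roles = [merged_overhead_role(f) for f in range(68)]
    for name in ("CRC", "IB0-7", "IB8-15", "IB16-23", "EOC or sync", "AOC"):
        print(name, roles.count(name))
```

Under this mapping the EOC and the AOC each receive 32 overhead bytes per superframe, assigned in alternating pairs of frames.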



The binary data streams output (LSB of each byte first) from the fast and interleaved data buffers shall be scrambled
separately using the following algorithm for both:

$$d'_n = d_n \oplus d'_{n-18} \oplus d'_{n-23}$$

where d_n is the n-th output from the fast or interleaved buffer (i.e., input to the scrambler), and d'_n is the n-th output from the
corresponding scrambler. This is illustrated in Figure 66.

These scramblers are applied to the serial data streams without reference to any framing or symbol synchronization. 
Descrambling in receivers can likewise be performed independent of symbol synchronization. 
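
The scrambler recursion and its inverse can be illustrated with the following sketch. The function names are introduced here for illustration, and the all-zero initial register state is an assumption of this example; the text above does not specify the initial state.

```python
def scramble(bits):
    """Apply d'[n] = d[n] XOR d'[n-18] XOR d'[n-23] to a bit sequence."""
    out = []
    for n, d in enumerate(bits):
        tap18 = out[n - 18] if n >= 18 else 0   # assumed zero initial state
        tap23 = out[n - 23] if n >= 23 else 0
        out.append(d ^ tap18 ^ tap23)
    return out


def descramble(bits):
    """Invert scramble(): d[n] = d'[n] XOR d'[n-18] XOR d'[n-23]."""
    out = []
    for n, s in enumerate(bits):
        tap18 = bits[n - 18] if n >= 18 else 0
        tap23 = bits[n - 23] if n >= 23 else 0
        out.append(s ^ tap18 ^ tap23)
    return out


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1, 0] * 8
    scrambled = scramble(data)
    print(descramble(scrambled) == data)
```

Because the descrambler uses only received bits as taps, it recovers the data without knowledge of frame or symbol boundaries, consistent with the statement that descrambling is independent of symbol synchronization.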
3.4.6 Forward error correction 

The ATU-C shall support downstream transmission with at least any combination of the FEC coding capabilities 
shown in Table 14. 



Table 14 - Minimum FEC coding capabilities for ATU-C

Parameter                       Fast buffer                      Interleaved buffer
Parity bytes per R-S codeword   R_F = 0, 2, 4, 6, 8, 10, 12,     R_I = 0, 2, 4, 6, 8, 10, 12,
                                14, 16 (see NOTE 2)              14, 16 (see NOTE 1 and NOTE 2)
DMT symbols per R-S codeword    S = 1                            S = 1, 2, 4, 8, 16
Interleave depth                not applicable                   D = 1, 2, 4, 8, 16, 32, 64

NOTE 1 - R_F can be > 0 only if K_F > 0, and R_I can be > 0 only if K_I > 0.
NOTE 2 - R_I shall be an integer multiple of S.



The ATU-C shall also support upstream transmission with at least any combination of the FEC coding capabilities 
shown in Table 23. 
3.4.6.1 Reed-Solomon coding

R (i.e., R_F or R_I) redundant check bytes c_0, c_1, ..., c_{R-2}, c_{R-1} shall be appended to K (i.e., K_F or S × K_I) message
bytes m_0, m_1, ..., m_{K-2}, m_{K-1} to form a Reed-Solomon code word of size N = K + R bytes. The check bytes are computed
from the message bytes using the equation:

$$C(D) = M(D)\,D^{R} \bmod G(D)$$

where

$$M(D) = m_0 D^{K-1} + m_1 D^{K-2} + \dots + m_{K-2} D + m_{K-1}$$

is the message polynomial,

$$C(D) = c_0 D^{R-1} + c_1 D^{R-2} + \dots + c_{R-2} D + c_{R-1}$$

is the check polynomial, and

$$G(D) = \prod_{i=0}^{R-1} (D + \alpha^{i})$$

is the generator polynomial of the Reed-Solomon code, where the index of the product runs from i = 0 to R-1. That is, C(D) is
the remainder obtained from dividing M(D) D^R by G(D). The arithmetic is performed in the Galois Field GF(256), where α is
a primitive element that satisfies the primitive binary polynomial x^8 + x^4 + x^3 + x^2 + 1. A data byte (d_7, d_6, ..., d_1, d_0)
is identified with the Galois Field element d_7 α^7 + d_6 α^6 + ... + d_1 α + d_0. The number of check bytes R and the codeword
size N vary.
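
The computation of the check bytes can be illustrated by the following sketch of systematic Reed-Solomon encoding over GF(256). It is a minimal, unoptimized example; the table-based multiplication and all function and variable names are implementation choices assumed here and are not taken from the specification.

```python
PRIMITIVE_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1


def _build_tables():
    """Build exponent/log tables for GF(256) with alpha = 0x02."""
    exp, log = [0] * 512, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x <<= 1
        if x & 0x100:
            x ^= PRIMITIVE_POLY
    for i in range(255, 512):      # duplicate so log sums need no modulo
        exp[i] = exp[i - 255]
    return exp, log


EXP, LOG = _build_tables()


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(256) elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(r: int) -> list:
    """G(D) = product over i = 0..r-1 of (D + alpha^i), highest degree first."""
    g = [1]
    for i in range(r):
        root = EXP[i]
        nxt = g + [0]                          # g(D) * D
        for k, coeff in enumerate(g):
            nxt[k + 1] ^= gf_mul(coeff, root)  # + alpha^i * g(D)
        g = nxt
    return g


def rs_check_bytes(message: bytes, r: int) -> bytes:
    """Return c_0..c_{R-1}, the remainder of M(D) * D^R divided by G(D)."""
    g = generator_poly(r)
    rem = list(message) + [0] * r              # coefficients of M(D) * D^R
    for i in range(len(message)):
        coef = rem[i]
        if coef:
            for k in range(1, len(g)):         # g[0] == 1 cancels rem[i]
                rem[i + k] ^= gf_mul(g[k], coef)
    return bytes(rem[len(message):])


if __name__ == "__main__":
    msg = bytes(range(1, 11))
    print(rs_check_bytes(msg, 4).hex())
```

A codeword formed as the message followed by these check bytes, read as a polynomial with the first byte as the highest-order coefficient, evaluates to zero at each root α^i for i = 0 to R-1, which offers a simple consistency check on an implementation.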

3.4.6.2 Reed-Solomon Forward Error Correction Superframe Synchronization

When entering the SHOWTIME state after completion of Initialization and Fast Retrain, the ATU shall align the first 
byte of the first Reed Solomon code-word with the first data byte of DF 0. 

3.4.6.3 Interleaving 

The Reed-Solomon codewords in the interleaved buffer shall be convolutionally interleaved. The interleaving depth
varies, but shall always be a power of 2. Convolutional interleaving is defined by the rule: "Each of the N bytes B_0, B_1, ...,
B_{N-1} in a Reed-Solomon codeword is delayed by an amount that varies linearly with the byte index. More precisely, byte B_i (with
index i) is delayed by (D-1) × i bytes, where D is the interleave depth".

An example for N = 5, D = 2 is shown in Table 15, where B^j_i denotes the i-th byte of the j-th codeword.

Table 15 - Convolutional interleaving example for N = 5, D = 2

Interleaver input    B^j_0   B^j_1       B^j_2   B^j_3       B^j_4   B^{j+1}_0   B^{j+1}_1   B^{j+1}_2   B^{j+1}_3   B^{j+1}_4
Interleaver output   B^j_0   B^{j-1}_3   B^j_1   B^{j-1}_4   B^j_2   B^{j+1}_0   B^j_3       B^{j+1}_1   B^j_4       B^{j+1}_2

With the above-defined rule, and the chosen interleaving depths (powers of 2), the output bytes from the interleaver
always occupy distinct time slots when N is odd. When N is even, a dummy byte shall be added at the beginning of the
code-word at the input to the interleaver. The resultant odd-length code-word is then convolutionally interleaved, and the dummy
byte shall then be removed from the output of the interleaver.
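
The interleaving rule can be illustrated with the sketch below, which places byte i of codeword j at output slot j × N + i + (D-1) × i and reports a collision if two bytes would occupy the same slot. The function name, the use of None for slots not filled by the supplied codewords, and the collision check are assumptions made for this example.

```python
def convolutional_interleave(codewords, depth):
    """Interleave equal-length codewords; byte i is delayed by (depth-1)*i slots.

    Returns the output sequence; slots that would be filled by codewords
    before or after the supplied ones are left as None. Items in the
    codewords must not themselves be None.
    """
    n = len(codewords[0])
    total = len(codewords) * n + (depth - 1) * (n - 1)
    out = [None] * total
    for j, codeword in enumerate(codewords):
        if len(codeword) != n:
            raise ValueError("all codewords must have the same length")
        for i, byte in enumerate(codeword):
            slot = j * n + i + (depth - 1) * i
            if out[slot] is not None:
                # Happens, for example, when N is even and depth is a power of 2 > 1.
                raise ValueError(f"slot {slot} is already occupied")
            out[slot] = byte
    return out


if __name__ == "__main__":
    # Label each byte as (codeword index, byte index) to mirror Table 15.
    cws = [[(j, i) for i in range(5)] for j in range(3)]
    print(convolutional_interleave(cws, depth=2)[5:15])
```

With N = 5 and D = 2, output slots 5 through 14 reproduce the output row of Table 15 for j = 1. With an even N such as 4 and D = 2, the sketch reports a collision, which illustrates why a dummy byte is added to even-length code-words.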
3.4.6.4 Support of higher downstream bit rates with S = 1/2

With a rate of 4000 data frames per second and a maximum of 255 bytes (maximum R-S code-word size) per data
frame, the ADSL downstream line rate is limited to approximately 8 Mbit/s per latency path. The line rate limit can be
increased to about 16 Mbit/s for the interleaved path by mapping two RS code-words into one FEC data frame (i.e., by using
S = 1/2 in the interleaved path). S = 1/2 shall be used in the downstream direction only over bearer channel AS0.
When the K_I data bytes per interleaved mux data frame cannot be packed into one RS code-word, i.e., K_I is such that K_I + R_I >
255, the K_I data bytes shall be split into two consecutive RS code-words. When K_I is even, the first and second code-word
have the same length N_I1 = N_I2 = (K_I/2 + R_I), otherwise the first code-word is one byte longer than the second, i.e. the first
code-word has N_I1 = (K_I + 1)/2 + R_I bytes, the second code-word has N_I2 = (K_I - 1)/2 + R_I bytes. For the FEC output data frame,
N_I = N_I1 + N_I2, with N_I < 511 bytes.

The convolutional interleaver requires all code-words to have the same odd length. To achieve the odd code-word
length, insertion of a dummy (not transmitted) byte may be required. For S = 1/2, the dummy byte addition to the first and/or
second code-word at the input of the interleaver shall be as in Table 16.



Table 16 - Dummy byte insertion at interleaver input for S = 1/2

N_I1    N_I2    Dummy Byte Insertion Action
odd     odd     no action
even    even    Add one dummy byte at beginning of both code-words
odd     even    Add one dummy byte at the beginning of the second code-word
even    odd     Add one dummy byte at the beginning of the first code-word and two dummy bytes at the
                beginning of the second code-word (the de-interleaver shall insert one dummy byte into the
                de-interleaver matrix on the first byte and the (D + 1)th byte of the corresponding code-word to
                make the addressing work properly.)
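
The code-word splitting rule for S = 1/2 and the dummy byte counts of Table 16 can be summarized in a short sketch. The function names are assumptions made for this example, and the sketch applies R_I check bytes to each of the two code-words as the length expressions above indicate.

```python
def split_codeword_lengths(k_i: int, r_i: int):
    """Return (N_I1, N_I2) for S = 1/2: K_I data bytes split over two code-words."""
    first_data = (k_i + 1) // 2    # one byte longer when K_I is odd
    second_data = k_i // 2
    return first_data + r_i, second_data + r_i


def dummy_bytes(n_i1: int, n_i2: int):
    """Dummy bytes prepended to (first, second) code-word, per Table 16."""
    first_odd, second_odd = n_i1 % 2 == 1, n_i2 % 2 == 1
    if first_odd and second_odd:
        return 0, 0
    if not first_odd and not second_odd:
        return 1, 1
    if first_odd and not second_odd:
        return 0, 1
    return 1, 2                    # first even, second odd


if __name__ == "__main__":
    n1, n2 = split_codeword_lengths(k_i=301, r_i=16)
    print(n1, n2, dummy_bytes(n1, n2))
```

For K_I = 301 and R_I = 16 the two code-words have 167 and 166 bytes, so a single dummy byte is added at the beginning of the second code-word.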



3.4.7 Tone ordering 

A DMT time-domain signal has a high peak-to-average ratio (PAR) (its amplitude distribution is almost Gaussian),
and large values may be clipped by the digital-to-analog converter. The error signal caused by clipping can be considered as an
additive negative impulse for the time sample that was clipped. The clipping error power is almost equally distributed across
all tones in the symbol in which clipping occurs. Clipping is therefore most likely to cause errors on those tones that, in
anticipation of a higher received SNR, have been assigned the largest number of bits (and therefore have the densest
constellations). These occasional errors can be reliably corrected by the FEC coding if the tones with the largest number of bits
have been assigned to the interleave buffer.

The numbers of bits and the relative gains to be used for every tone shall be calculated in the ATU-R receiver, and
sent back to the ATU-C according to a defined protocol. The pairs of numbers are stored, in ascending order of frequency (or
tone number i), in a bit and gain table.

The "tone-ordered" encoding shall first assign the 8 × N_F bits from the fast data buffer to the tones with the smallest
number of bits assigned to them, and then the 8 × N_I bits from the interleave data buffer to the remaining tones.

All tones shall be encoded with the number of bits assigned to them; one tone may therefore have a mixture of bits
from the fast and interleaved buffers.

The ordered bit table b'_i shall be based on the original bit table b_i as follows:
For k = 0 to 15
{
    From the bit table, find the set of all i with the number of bits per tone b_i = k
    Assign b_i to the ordered bit allocation table in ascending order of i
}
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A complementary de-ordering procedure should be performed in the ATU-R receiver. It is not necessary, however, to 
send the results of the ordering process to the receiver because the bit table was originally generated in the ATU-R, and 
therefore that table has all the information necessary to perform the de-ordering. 
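
The ordering loop and the fast/interleaved split described above can be sketched as follows. This is only an illustration: the function names and the 6-tone bit table are hypothetical, and the sketch assumes the table holds one integer per tone in the range 0 to 15.

```python
def tone_order(bit_table):
    """Return (ordered_bits, ordered_tones) for a bit table indexed by tone.

    Tones are visited for k = 0 to 15 and, within each k, in ascending
    order of tone index i, as in the ordering loop above.
    """
    ordered_bits, ordered_tones = [], []
    for k in range(16):
        for i, b in enumerate(bit_table):
            if b == k:
                ordered_bits.append(b)
                ordered_tones.append(i)
    return ordered_bits, ordered_tones


def fast_bits_per_tone(ordered_bits, n_fast_bits):
    """Number of fast-buffer bits carried by each tone of the ordered table.

    The fast buffer fills the tones with the fewest bits first; a tone that
    is only partly filled carries a mixture of fast and interleaved bits.
    """
    counts, remaining = [], n_fast_bits
    for b in ordered_bits:
        take = min(b, remaining)
        counts.append(take)
        remaining -= take
    return counts


# Hypothetical 6-tone bit table carrying 16 bits (N_F = 1, N_I = 1).
bits, tones = tone_order([2, 0, 4, 3, 5, 2])
fast = fast_bits_per_tone(bits, 8 * 1)
```

In this example the fifth tone of the ordered table carries one fast bit and three interleaved bits, which is the mixed case mentioned above.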

Figure 67 and Figure 68 show an example of tone ordering and bit extraction (without and with trellis coding,
respectively) for a 6-tone DMT case, with N_F = 1 and N_I = 1 for simplicity.
3.4.8 Constellation encoder (Trellis code version)

Block processing of Wei's 16-state 4-dimensional trellis code is optional in ITU-T Recommendation G.992.1 to
improve system performance. An algorithmic constellation encoder shall be used to construct constellations with a maximum
number of bits equal to N_downmax, where 8 ≤ N_downmax ≤ 15.
3.4.8.1 Bit extraction

Data bytes from the data frame buffer shall be extracted according to a re-ordered bit allocation table b'_i, least
significant bit first. Because of the 4-dimensional nature of the code, the extraction is based on pairs of consecutive b'_i, rather
than on individual ones, as in the non-trellis-coded case. Furthermore, due to the constellation expansion associated with
coding, the bit allocation table, b'_i, specifies the number of coded bits per tone, which can be any integer from 2 to 15. Given
a pair (x, y) of consecutive b'_i, x + y - 1 bits (reflecting a constellation expansion of 1 bit per 4 dimensions, or one half bit per
tone) are extracted from the data frame buffer. These z = x + y - 1 bits (t_z, t_{z-1}, ..., t_1) are used to form the binary word u as
shown in Table 17. The tone ordering procedure ensures x ≤ y. Single-bit constellations are not allowed because they can be
replaced by 2-bit constellations with the same average energy. Refer to 0 for the reason behind the special form of the word u
for the case x = 0, y > 1.



Table 17 - Forming the binary word u

Condition        Binary word / comment
---------------  --------------------------------------------------------
x > 1, y > 1     u = (t_z, t_{z-1}, ..., t_1)
x = 1, y ≥ 1     Condition not allowed
x = 0, y > 1     u = (t_z, t_{z-1}, ..., t_2, 0, t_1, 0)
x = 0, y = 0     Bit extraction not necessary, no message bits being sent
x = 0, y = 1     Condition not allowed

NOTE - t_1 is the first bit extracted from the data frame buffer.



The last two 4-dimensional symbols in the DMT symbol shall be chosen to force the convolutional encoder state to
the zero state. For each of these symbols, the 2 LSBs of u are pre-determined, and only (x + y - 3) bits shall be extracted from
the data frame buffer and shall be allocated to t_3, t_4, ..., t_z.
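
As an illustration of Table 17, the following sketch forms the word u from the z extracted bits for an ordinary tone pair. The function name and the list representation are assumptions made for this example; the forced-termination case of the last two 4-dimensional symbols is not covered.

```python
def form_u(t, x, y):
    """Form the binary word u for a tone pair (x, y), following Table 17.

    t holds the extracted bits with t[0] = t_1 (the first bit extracted).
    The result lists u most significant bit first.
    """
    if x > y:
        raise ValueError("tone ordering is expected to ensure x <= y")
    if x == 0 and y == 0:
        return []                      # no message bits are sent
    if x == 1 or (x == 0 and y == 1):
        raise ValueError("condition not allowed")
    z = x + y - 1
    if len(t) != z:
        raise ValueError("expected z = x + y - 1 extracted bits")
    if x == 0:
        # u = (t_z, ..., t_2, 0, t_1, 0), which has z + 2 = y + 1 bits
        return t[:0:-1] + [0, t[0], 0]
    # u = (t_z, t_{z-1}, ..., t_1)
    return t[::-1]
```

For example, with x = 0 and y = 3 two bits are extracted and u has four bits, two of which are the fixed zeros.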
3.4.8.2 Bit conversion 

The binary word u = (u_{z'}, u_{z'-1}, ..., u_1) determines two binary words v = (v_{z'-y}, ..., v_0) and w = (w_{y-1}, ..., w_0), which are
used to look up two constellation points in the encoder constellation table. For the usual case of x > 1 and y > 1, z' = z = x + y - 1,
and v and w contain x and y bits respectively. For the special case of x = 0 and y > 1, z' = z + 2 = y + 1, v = (v_1, v_0) = 0 and
w = (w_{y-1}, ..., w_0). The bits (u_3, u_2, u_1) determine (v_1, v_0) and (w_1, w_0) according to Figure 69.

The convolutional encoder shown in Figure 69 is a systematic encoder (i.e. u_1 and u_2 are passed through unchanged)
as shown in Figure 70. The convolutional encoder state (S_3, S_2, S_1, S_0) is used to label the states of the trellis shown in
Figure 72. At the beginning of a DMT symbol period the convolutional encoder state is initialized to (0, 0, 0, 0). The remaining
bits of v and w are obtained from the less significant and more significant parts of (u_{z'}, u_{z'-1}, ..., u_4), respectively. When x > 1
and y > 1, v = (u_{z'-y+2}, u_{z'-y+1}, ..., u_4, v_1, v_0) and w = (u_{z'}, u_{z'-1}, ..., u_{z'-y+3}, w_1, w_0). When x = 0, the bit extraction and
conversion algorithms have been judiciously designed so that v_1 = v_0 = 0. The binary word v is input first to the constellation
encoder, and then the binary word w.

In order to force the final state to the zero state (0, 0, 0, 0), the 2 LSBs u_1 and u_2 of the final two 4-dimensional
symbols in the DMT symbol are constrained to u_1 = S_1 ⊕ S_3, and u_2 = S_2.
3.4.8.3 Coset partition and trellis diagram

In a trellis code modulation system, the expanded constellation is labeled and partitioned into subsets ("cosets") using
a technique called mapping by set-partitioning. The four-dimensional cosets in Wei's code can each be written as the union of
two Cartesian products of two 2-dimensional cosets. For example, C_4^0 = (C_2^0 × C_2^0) ∪ (C_2^3 × C_2^3). The four constituent
2-dimensional cosets, denoted by C_2^0, C_2^1, C_2^2, C_2^3, are shown in Figure 71.

The encoding algorithm ensures that the 2 least significant bits of a constellation point comprise the index i of the
2-dimensional coset C_2^i in which the constellation point lies. The bits (v_1, v_0) and (w_1, w_0) are in fact the binary representations
of this index.

The three bits (u_2, u_1, u_0) are used to select one of the 8 possible four-dimensional cosets. The 8 cosets are labeled
C_4^i, where i is the integer with binary representation (u_2, u_1, u_0). The additional bit u_3 (see Figure 69) determines which one of
the two Cartesian products of 2-dimensional cosets in the 4-dimensional coset is chosen. The relationship is shown in Table
18. The bits (v_1, v_0) and (w_1, w_0) are computed from (u_3, u_2, u_1, u_0) using the linear equations given in Figure 69.

Table 18 - Relation between 4-dimensional and 2-dimensional cosets

4-D coset   u_3 u_2 u_1 u_0   v_1 v_0   w_1 w_0   2-D cosets
---------   ---------------   -------   -------   -------------
C_4^0       0   0   0   0     0   0     0   0     C_2^0 × C_2^0
            1   0   0   0     1   1     1   1     C_2^3 × C_2^3
C_4^4       0   1   0   0     0   0     1   1     C_2^0 × C_2^3
            1   1   0   0     1   1     0   0     C_2^3 × C_2^0
C_4^2       0   0   1   0     1   0     1   0     C_2^2 × C_2^2
            1   0   1   0     0   1     0   1     C_2^1 × C_2^1
C_4^6       0   1   1   0     1   0     0   1     C_2^2 × C_2^1
            1   1   1   0     0   1     1   0     C_2^1 × C_2^2
C_4^1       0   0   0   1     0   0     1   0     C_2^0 × C_2^2
            1   0   0   1     1   1     0   1     C_2^3 × C_2^1
C_4^5       0   1   0   1     0   0     0   1     C_2^0 × C_2^1
            1   1   0   1     1   1     1   0     C_2^3 × C_2^2
C_4^3       0   0   1   1     1   0     0   0     C_2^2 × C_2^0
            1   0   1   1     0   1     1   1     C_2^1 × C_2^3
C_4^7       0   1   1   1     1   0     1   1     C_2^2 × C_2^3
            1   1   1   1     0   1     0   0     C_2^1 × C_2^0
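
The relation in Table 18 can be reproduced with a small set of XOR equations. The equations in the sketch below are an assumption: Figure 69 is not reproduced in this text, so they were chosen because they agree with the rows of Table 18, and they should be checked against the figure before being relied upon. The function names are hypothetical.

```python
def coset_bits(u3, u2, u1, u0):
    """Return ((v1, v0), (w1, w0)) for the four least significant bits of u.

    Assumed linear equations, chosen to agree with Table 18.
    """
    v1 = u1 ^ u3
    v0 = u3
    w1 = u0 ^ u1 ^ u2 ^ u3
    w0 = u2 ^ u3
    return (v1, v0), (w1, w0)


def coset_indices(u3, u2, u1, u0):
    """Return (4-D coset index, first 2-D coset index, second 2-D coset index)."""
    (v1, v0), (w1, w0) = coset_bits(u3, u2, u1, u0)
    return 4 * u2 + 2 * u1 + u0, 2 * v1 + v0, 2 * w1 + w0


# (u3, u2, u1, u0) = (1, 1, 0, 0) selects the second product of coset C_4^4.
example = coset_indices(1, 1, 0, 0)
```

Here the example evaluates to 4-D coset 4 with 2-D cosets C_2^3 and C_2^0, matching the second row of C_4^4 in Table 18.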



Figure 72 shows the trellis diagram based on the finite state machine in Figure 70, and the one-to-one correspondence
between (u_2, u_1, u_0) and the 4-dimensional cosets. In the figures, S = (S_3, S_2, S_1, S_0) represents the current state, while
T = (T_3, T_2, T_1, T_0) represents the next state in the finite state machine. S is connected to T in the constellation diagram by a branch
determined by the values of u_2 and u_1. The branch is labeled with the 4-dimensional coset specified by the values of u_2, u_1
(and u_0 = S_0, see Figure 71). To make the constellation diagram more readable, the indices of the 4-dimensional coset labels
are listed next to the starting and end points of the branches, rather than on the branches themselves. The leftmost label
corresponds to the uppermost branch for each state. The constellation diagram is used when decoding the trellis code by the
Viterbi algorithm.
3.4.8.4 Constellation encoder

For a given sub-carrier, the encoder shall select an odd-integer point (X, Y) from the square-grid constellation based
on the b bits of either {v_{b-1}, v_{b-2}, ..., v_1, v_0} or {w_{b-1}, w_{b-2}, ..., w_1, w_0}. For convenience of description, these b bits are
identified with an integer label whose binary representation is (v_{b-1}, v_{b-2}, ..., v_1, v_0), but the same encoding rules apply also to
the w vector. For example, for b = 2, the four constellation points are labeled 0, 1, 2, 3 corresponding to (v_1, v_0) = (0, 0), (0, 1),
(1, 0), (1, 1), respectively (v_0 is the first bit extracted from the buffer).




3.4.8.4.1 Even values of b 

For even values of b, the integer values X and Y of the constellation point (X, Y) shall be determined from the b bits
{v_{b-1}, v_{b-2}, ..., v_1, v_0} as follows. X and Y are the odd integers with twos-complement binary representations
(v_{b-1}, v_{b-3}, ..., v_1, 1) and (v_{b-2}, v_{b-4}, ..., v_0, 1), respectively. The most significant bits (MSBs), v_{b-1} and v_{b-2}, are the
sign bits for X and Y, respectively.

Figure 74 shows example constellations for b = 2 and b = 4. (The values of X and Y shown represent the output of
the constellation encoder. These values require appropriate scaling: 1) so that all constellations, regardless of size, represent
the same RMS energy, and 2) by the fine gain scaling before modulation by the IDFT.)

The 4-bit constellation can be obtained from the 2-bit constellation by replacing each label n by a 2 × 2 block of labels
as shown in Figure 74. The same procedure can be used to construct the larger even-bit constellations recursively.

The constellations obtained for even values of b are square in shape. The least significant bits {v_1, v_0} represent the
coset labeling of the constituent 2-dimensional cosets used in the 4-dimensional Wei trellis code.
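
The even-b rule can be sketched directly from the twos-complement description above. The function names are hypothetical, and the output is the unscaled (X, Y) pair, before the energy normalization and fine gain scaling mentioned in the text.

```python
def twos_complement(bits):
    """Value of a twos-complement word given most significant bit first."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    if bits[0]:
        value -= 1 << len(bits)
    return value


def encode_even(label, b):
    """Map a b-bit label (b even, b >= 2) to the odd-integer point (X, Y)."""
    if b < 2 or b % 2:
        raise ValueError("b must be even and at least 2")
    v = [(label >> k) & 1 for k in range(b)]           # v[k] is bit v_k
    x_bits = [v[k] for k in range(b - 1, 0, -2)] + [1]  # v_{b-1}, v_{b-3}, ..., v_1, 1
    y_bits = [v[k] for k in range(b - 2, -1, -2)] + [1]  # v_{b-2}, v_{b-4}, ..., v_0, 1
    return twos_complement(x_bits), twos_complement(y_bits)


four_qam = [encode_even(n, 2) for n in range(4)]
```

For b = 2 the labels 0, 1, 2, 3 map to (+1, +1), (+1, -1), (-1, +1) and (-1, -1), which is the same sign pattern used for the synchronization symbol in Table 20.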
3.4.8.4.2 Odd values of b, b = 3

Figure 75 shows the constellation for the case b = 3. (The values of X and Y shown represent the output of the
constellation encoder. These values require appropriate scaling: 1) so that all constellations, regardless of size, represent the
same RMS energy, and 2) by the fine gain scaling before modulation by the IDFT.)
3.4.8.4.3 Odd values of b, b > 3

If b is odd and greater than 3, the 2 MSBs of X and the 2 MSBs of Y are determined by the 5 MSBs of the b bits. Let
c = (b + 1)/2; then X and Y have the twos-complement binary representations (X_c, X_{c-1}, v_{b-4}, v_{b-6}, ..., v_3, v_1, 1) and
(Y_c, Y_{c-1}, v_{b-5}, v_{b-7}, v_{b-9}, ..., v_2, v_0, 1), where X_c and Y_c are the sign bits of X and Y respectively. The relationship between
X_c, X_{c-1}, Y_c, Y_{c-1} and v_{b-1}, v_{b-2}, ..., v_{b-5} is shown in Table 19.

Table 19. Determining the top 2 bits of X and Y

v_{b-1}, v_{b-2}, ..., v_{b-5}   X_c, X_{c-1}   Y_c, Y_{c-1}
------------------------------   ------------   ------------
0 0 0 0 0                        0 0            0 0
0 0 0 0 1                        0 0            0 0
0 0 0 1 0                        0 0            0 0
0 0 0 1 1                        0 0            0 0
0 0 1 0 0                        0 0            1 1
0 0 1 0 1                        0 0            1 1
0 0 1 1 0                        0 0            1 1
0 0 1 1 1                        0 0            1 1
0 1 0 0 0                        1 1            0 0
0 1 0 0 1                        1 1            0 0
0 1 0 1 0                        1 1            0 0
0 1 0 1 1                        1 1            0 0
0 1 1 0 0                        1 1            1 1
0 1 1 0 1                        1 1            1 1
0 1 1 1 0                        1 1            1 1
0 1 1 1 1                        1 1            1 1
1 0 0 0 0                        0 1            0 0
1 0 0 0 1                        0 1            0 0
1 0 0 1 0                        1 0            0 0
1 0 0 1 1                        1 0            0 0
1 0 1 0 0                        0 0            0 1
1 0 1 0 1                        0 0            1 0
1 0 1 1 0                        0 0            0 1
1 0 1 1 1                        0 0            1 0
1 1 0 0 0                        1 1            0 1
1 1 0 0 1                        1 1            1 0
1 1 0 1 0                        1 1            0 1
1 1 0 1 1                        1 1            1 0
1 1 1 0 0                        0 1            1 1
1 1 1 0 1                        0 1            1 1
1 1 1 1 0                        1 0            1 1
1 1 1 1 1                        1 0            1 1



Figure 76 shows the constellation for the case b = 5. (The X and Y values are on a ±1, ±3, ... grid. The values of X
and Y shown represent the output of the constellation encoder. These values require appropriate scaling:

1) so that all constellations, regardless of size, represent the same RMS energy, and

2) by the fine gain scaling before modulation by the IDFT.)

The 7-bit constellation shall be obtained from the 5-bit constellation by replacing each label n by the 2 × 2 block of
labels as shown in Figure 74.

Again, the same procedure shall be used to construct the larger odd-bit constellations recursively. Note also that the
least significant bits {v_1, v_0} represent the coset labeling of the constituent 2-dimensional cosets used in the 4-dimensional
Wei trellis code.
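
For odd b greater than 3, the rule of Table 19 can be sketched as a lookup on the 5 MSBs followed by the same twos-complement assembly used for even b. The two lookup lists below transcribe Table 19 as 2-bit integers indexed by the 5-bit value (v_{b-1}, ..., v_{b-5}); the names are hypothetical and the output is unscaled.

```python
# Table 19 as 2-bit integers, indexed by the 5 MSBs of the label.
X_TOP = [0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 3, 3, 3, 3, 3, 3,
         1, 1, 2, 2, 0, 0, 0, 0, 3, 3, 3, 3, 1, 1, 2, 2]
Y_TOP = [0, 0, 0, 0, 3, 3, 3, 3, 0, 0, 0, 0, 3, 3, 3, 3,
         0, 0, 0, 0, 1, 2, 1, 2, 1, 2, 1, 2, 3, 3, 3, 3]


def twos_complement(bits):
    """Value of a twos-complement word given most significant bit first."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    if bits[0]:
        value -= 1 << len(bits)
    return value


def encode_odd(label, b):
    """Map a b-bit label (b odd, b >= 5) to the odd-integer point (X, Y)."""
    if b < 5 or b % 2 == 0:
        raise ValueError("b must be odd and at least 5")
    v = [(label >> k) & 1 for k in range(b)]   # v[k] is bit v_k
    top = label >> (b - 5)                     # (v_{b-1}, ..., v_{b-5})
    x_bits = [X_TOP[top] >> 1, X_TOP[top] & 1]
    x_bits += [v[k] for k in range(b - 4, 0, -2)] + [1]
    y_bits = [Y_TOP[top] >> 1, Y_TOP[top] & 1]
    y_bits += [v[k] for k in range(b - 5, -1, -2)] + [1]
    return twos_complement(x_bits), twos_complement(y_bits)


points_5bit = [encode_odd(n, 5) for n in range(32)]
```

For b = 5 this yields 32 distinct points on the ±1, ±3, ±5 grid with the four corner positions unused, so the constellation is cross-shaped rather than square.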

3.4.9 Constellation encoder (No Trellis coding)

An algorithmic constellation encoder shall be used to construct constellations with a maximum number of bits equal
to N_downmax, where 8 ≤ N_downmax ≤ 15. The constellation encoder shall not use trellis coding with this option.

3.4.9.1 Bit extraction

Data bits from the frame data buffer shall be extracted according to a re-ordered bit allocation table b'_i, least
significant bit first. The number of bits per tone, b'_i, can take any non-negative integer values not exceeding N_downmax, and
greater than 1. For a given tone, b = b'_i bits are extracted from the data frame buffer, and these bits form a binary word
{v_{b-1}, v_{b-2}, ..., v_1, v_0}. The first bit extracted shall be v_0, the LSB.
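
A minimal sketch of this extraction, assuming the frame buffer is available as a byte string and the ordered bit table as a list of integers (both names are hypothetical):

```python
def bits_lsb_first(data):
    """Yield the bits of each byte, least significant bit first."""
    for byte in data:
        for k in range(8):
            yield (byte >> k) & 1


def extract_labels(data, ordered_bits):
    """Return one integer label per tone; the first extracted bit is v_0."""
    stream = bits_lsb_first(data)
    labels = []
    for b in ordered_bits:
        label = 0
        for k in range(b):
            label |= next(stream) << k
        labels.append(label)
    return labels


# One byte spread over three tones carrying 2, 3 and 3 bits.
labels = extract_labels(bytes([0b10110100]), [2, 3, 3])
```

Each label can then be passed to the constellation mapping for the corresponding value of b.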

3.4.10 Gain scaling

For the transmission of data symbols, gain scaling, g_i, shall be applied as requested by the ATU-R and possibly
updated during Showtime via a bit swap procedure. Only values of g_i equal to zero or within a range of approximately 0.19 to
1.33 (i.e., -14.5 dB to +2.5 dB) may be used. For the transmission of synchronization symbols, no gain scaling shall be applied
to any sub-carrier.

Each constellation point (X_i, Y_i), i.e. the complex number X_i + jY_i output from the encoder, is multiplied by g_i:
Z_i = g_i (X_i + jY_i)
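
The gain range and its decibel equivalent can be checked with a short sketch. The hard limits 0.19 and 1.33 are used here as if they were exact, which is an assumption, since the text gives them only approximately; the function names are hypothetical.

```python
import math


def gain_db(g):
    """Linear amplitude gain expressed in decibels."""
    return 20 * math.log10(g)


def scale_point(x, y, g):
    """Return Z_i = g_i (X_i + jY_i), rejecting gains outside the stated range."""
    if g != 0 and not (0.19 <= g <= 1.33):
        raise ValueError("gain outside the approximate range 0.19 to 1.33")
    return g * complex(x, y)


z = scale_point(1, -1, 0.5)
```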

3.4.11 Modulation
3.4.11.1 Sub-carriers

The frequency spacing, Δf, between sub-carriers is 4.3125 kHz, with a tolerance of ±50 ppm.
3.4.11.1.1 Data sub-carriers

The channel analysis signal allows for a maximum of 255 carriers (at frequencies nΔf, n = 1 to 255) to be used. The
lower limit of n depends on both the duplexing and service options selected. For example, for ADSL above POTS service
option, if overlapped spectrum is used to separate downstream and upstream signals, then the lower limit on n is determined by
the POTS splitting filters; if frequency division multiplexing (FDM) is used, the lower limit is set by the downstream-upstream
separation filters.

3.4.11.1.2 Pilot

Carrier #N_pilot (f_pilot = 4.3125 × N_pilot kHz) shall be reserved for a pilot; that is, b(N_pilot) = 0 and g(N_pilot) = 1.

The data modulated onto the pilot sub-carrier shall be a constant {0,0}. Use of this pilot allows resolution of sample
timing in a receiver modulo-8 samples. Therefore a gross timing error that is an integer multiple of 8 samples could still persist
after a micro-interruption (e.g., a temporary short circuit, open circuit or severe line hit); correction of such timing errors is
made possible by the use of the synchronization symbol.

3.4.11.1.3 Nyquist frequency

The carrier at the Nyquist frequency (#256) shall not be used for user data and shall be real valued.

3.4.11.1.4 DC

The carrier at DC (#0) shall not be used, and shall contain no energy.
3.4.11.2 Modulation by the inverse discrete Fourier transform (IDFT)

The modulating transform defines the relationship between the 512 real values x_n and the Z_i:

x_n = \sum_{i=0}^{511} \exp\left( \frac{j \pi n i}{256} \right) Z_i    for n = 0 to 511

The constellation encoder and gain scaling generate only 255 complex values of Z_i. In order to generate real values of x_n, the
input values (255 complex values plus zero at DC and one real value for Nyquist if used) shall be augmented so that the vector
Z has Hermitian symmetry. That is,

Z_i = conj(Z_{512-i})    for i = 257 to 511
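
The Hermitian extension and the transform can be sketched with a direct, unoptimized evaluation of the sum. A practical modulator would use an FFT; the function names and the single-tone test input are assumptions made for this illustration.

```python
import cmath

N = 512  # IDFT size


def build_spectrum(data, nyquist=0.0):
    """Place 255 complex values on carriers 1 to 255 and extend the vector.

    Carrier 0 (DC) is zero, carrier 256 (Nyquist) is real, and
    Z_i = conj(Z_{512-i}) for i = 257 to 511.
    """
    if len(data) != 255:
        raise ValueError("expected 255 complex values")
    z = [0j] * N
    z[1:256] = data
    z[256] = complex(nyquist, 0.0)
    for i in range(257, N):
        z[i] = z[N - i].conjugate()
    return z


def idft(z):
    """Evaluate x_n = sum_i exp(j*pi*n*i/256) Z_i for n = 0 to 511."""
    twiddle = [cmath.exp(2j * cmath.pi * k / N) for k in range(N)]
    return [sum(twiddle[(n * i) % N] * z[i] for i in range(N))
            for n in range(N)]


# Single tone on carrier 1: the output should be the real cosine 2*cos(pi*n/256).
tone = [0j] * 255
tone[0] = 1 + 0j
x = idft(build_spectrum(tone))
```

With Hermitian symmetry in place, the imaginary parts of x_n vanish up to rounding error, so only the real parts need to be passed on.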

3.4.11.3 Synchronization symbol

The synchronization symbol permits recovery of the frame boundary after micro-interruptions that might otherwise
force retraining. The data symbol rate, f_symb = 4 kHz, the carrier separation, Δf = 4.3125 kHz, and the IDFT size, N = 512, are
such that a cyclic prefix of 40 samples could be used. That is,

(512 + 40) × 4.0 = 512 × 4.3125 = 2208

The cyclic prefix shall, however, be shortened to 32 samples, and a synchronization symbol (with a nominal length of
544 samples) is inserted after every 68 data symbols. That is,

(512 + 32) × 69 = (512 + 40) × 68

The data pattern used in the synchronization symbol shall be the pseudo-random sequence PRD (d_n, for n = 1 to 512)
defined by

d_n = 1    for n = 1 to 9

d_n = d_{n-4} ⊕ d_{n-9}    for n = 10 to 512

The first pair of bits (d_1 and d_2) shall be used for the DC and Nyquist sub-carriers (the power assigned to them is
zero, so the bits are effectively ignored). The first and second bits of subsequent pairs are then used to define the X_i and Y_i for
i = 1 to 255 as shown in Table 20.



Table 20. Mapping of two data bits into a 4QAM constellation

d_{2i+1}, d_{2i+2}   Decimal label   X_i, Y_i
------------------   -------------   --------
0 0                  0               + +
0 1                  1               + -
1 0                  2               - +
1 1                  3               - -





The period of the PRD is only 511 bits, so d_512 shall be equal to d_1. The bits d_1 to d_9 shall be re-initialized for each
synchronization symbol, so each symbol uses the same data. Bits 129 and 130, which modulate the pilot carrier, shall be
overwritten by {0,0}, generating the {+,+} constellation.

The minimum set of sub-carriers to be used is the set used for data transmission (i.e., those for which b_i > 0). The
data modulated onto each sub-carrier shall be as defined above; it shall not depend on which sub-carriers are used.
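
The PRD recurrence and the mapping of Table 20 can be sketched as follows. The function names are hypothetical. The pilot index 64 used in the comment follows from the rule that carrier i takes bits d_{2i+1} and d_{2i+2}, so bits 129 and 130 belong to i = 64.

```python
def prd_sequence():
    """Return d_1 .. d_512 as a list, where list index k-1 holds d_k."""
    d = [1] * 9                           # d_n = 1 for n = 1 to 9
    for n in range(10, 513):
        d.append(d[n - 5] ^ d[n - 10])    # d_n = d_{n-4} XOR d_{n-9}
    return d


# Table 20: bit pair -> signs of (X_i, Y_i)
SIGNS = {(0, 0): (1, 1), (0, 1): (1, -1), (1, 0): (-1, 1), (1, 1): (-1, -1)}


def sync_symbol_points():
    """Return {i: (X_i, Y_i)} for carriers i = 1 to 255 of the sync symbol."""
    d = prd_sequence()
    d[128], d[129] = 0, 0                 # bits 129 and 130 (pilot, i = 64)
    return {i: SIGNS[(d[2 * i], d[2 * i + 1])] for i in range(1, 256)}


sequence = prd_sequence()
points = sync_symbol_points()
```

Because the recurrence has period 511, the generated d_512 equals d_1 without any special handling.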

3.4.12. Cyclic prefix

The last 32 samples of the output of the IDFT (x_n for n = 480 to 511) shall be prepended to the block of 512 samples
and read out to the digital-to-analog converter (DAC) in sequence. That is, the subscripts, n, of the DAC samples in sequence
are 480, ..., 511, 0, ..., 511.
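
A minimal sketch of the prefix operation, assuming the IDFT output is held in a list of 512 samples (the function name is hypothetical):

```python
def add_cyclic_prefix(samples, prefix_len=32):
    """Prepend the last prefix_len samples to a block of 512 samples."""
    if len(samples) != 512:
        raise ValueError("expected a block of 512 samples")
    return samples[-prefix_len:] + samples


# Using the sample index as the sample value makes the read-out order visible.
dac_order = add_cyclic_prefix(list(range(512)))
```

The resulting block has 544 samples, which is the symbol length assumed in the relation (512 + 32) × 69 = (512 + 40) × 68 given for the synchronization symbol.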

3.4.13. Transmitter dynamic range 

The transmitter includes all analog transmitter functions: the D/A converter, the anti-aliasing filter, the hybrid
circuitry, and the high-pass part of the POTS or ISDN splitter.
3.4.13.1. Maximum clipping rate

The maximum output signal of the transmitter shall be such that the signal shall be clipped no more than 0.00001%
of the time.

3.4.13.2. Noise/Distortion floor 

The signal to noise plus distortion ratio of the transmitted signal in a given sub-carrier is specified as the ratio of the
rms value of the tone in that sub-carrier to the rms sum of all the non-tone signals in the 4.3125 kHz frequency band centered
on the sub-carrier frequency. This ratio is measured for each sub-carrier used for transmission using a MultiTone Power Ratio
(MTPR) test as shown in Figure 77.

Over the transmission frequency band, the MTPR of the transmitter in any sub-carrier shall be no less than
(3 N_downi + 20) dB, where N_downi is defined as the size of the constellation (in bits) to be used on sub-carrier i. The minimum
transmitter MTPR shall be at least 38 dB (corresponding to an N_downi of 6) for any sub-carrier.
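
Read together, the two requirements amount to a per-carrier limit with a floor. The sketch below assumes that reading, i.e. that the 38 dB minimum applies whenever (3 N_downi + 20) dB would be lower; the function name is hypothetical.

```python
def required_mtpr_db(n_bits):
    """Minimum transmitter MTPR in dB for a sub-carrier carrying n_bits bits."""
    return max(3 * n_bits + 20, 38)


limits = {n: required_mtpr_db(n) for n in (2, 6, 10, 15)}
```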

Signals transmitted during normal initialization and data transmission cannot be used for this test because the DMT
symbols have a cyclic prefix appended, and the PSD of a non-repetitive signal does not have nulls at any sub-carrier
frequencies. A gated FFT-based analyzer could be used, but this would measure both the non-linear distortion and the linear
distortion introduced by the transmit filter. Therefore this test will require that the transmitter be programmed with special
software, probably to be used during development only.
3.5 ATU-R Functional Characteristics 

An ATU-R may support STM transmission or ATM transmission or both. The framing modes that shall be supported depend
upon the ATU-R being configured for either STM or ATM transport. If framing mode k is supported, then modes k-1, ..., 0
shall also be supported.

During initialization, the ATU-C and ATU-R shall indicate a framing mode number 0, 1, 2 or 3 which they intend to use. 
The lowest indicated framing mode shall be used. 

An ATU-R may support reconstruction of a Network Timing Reference (NTR) from the downstream indicator bits. 
3.5.1 STM Transmission Protocol Specific functionalities
3.5.1.1 ATU-R input and output V interfaces for STM transport

The functional data interfaces at the ATU-R are shown in Figure 78. Output interfaces for the high-speed downstream
simplex bearer channels are designated AS0 through AS3; input-output interfaces for the duplex bearer channels are
designated LS0 through LS2. There may also be a functional interface to transport operations, administration and maintenance
(OAM) indicators from the CI to the ATU-R; this interface may physically be combined with the LS0 interface.
3.5.1.2 Downstream simplex channels - Transceiver bit rates

The simplex channels are transported in the downstream direction only; therefore their data interfaces at the ATU-R
operate only as outputs.

3.5.1.3 Duplex channels - Transceiver bit rates

The duplex channels are transported in both directions, so the ATU-R shall provide both input and output data
interfaces.

3.5.1.4 Framing Structure for STM transport

An ATU-R configured for STM transport shall support the full overhead framing structure 0. The support of full
overhead framing structure 1 and reduced overhead framing structures 2 and 3 is optional.

Preservation of T-R interface byte boundaries (if present) at the U-R interface may be supported for any of the U-R
interface framing structures.

An ATU-R configured for STM transport may support reconstruction of a Network Timing Reference (NTR).
3.5.2 ATM Transport Protocol Specific functionalities

3.5.2.1 ATU-R input and output V interfaces for ATM transport

The ATU-R input and output T interfaces are identical to the ATU-C input and output interfaces, as shown in Figure 79.

3.5.2.2 ATM Cell specific functionalities

The ATM cell specific functionalities performed at the ATU-R shall be identical to the ATM cell specific 
functionalities performed at the ATU-C. 
3.5.2.3 Framing Structure for ATM transport 

An ATU-R configured for ATM transport shall support the full overhead framing structures 0 and 1. 

The ATU-R transmitter shall preserve T-R interface byte boundaries (explicitly present or implied by ATM cell 
boundaries) at the U-R interface, independent of the U-R interface framing structure. 

An ATU-R configured for ATM transport may support reconstruction of a Network Timing Reference (NTR). 
To ensure framing structure 0 interoperability between an ATM ATU-R and an ATM cell TC plus an STM ATU-C (i.e., 
ATM over STM), the following shall apply: 

• An STM ATU-C transporting ATM cells and not preserving V-C byte boundaries at the U-C interface shall indicate 
during initialization that frame structure 0 is the highest frame structure supported; 

• An STM ATU-C transporting ATM cells and preserving V-C byte boundaries at the U-C interface shall indicate 
during initialization that frame structure 0, 1, 2 or 3 is the highest frame structure supported, as applicable to the 
implementation; 

• An ATM ATU-R receiver operating in framing structure 0 can not assume that the ATU-C transmitter will preserve 
V-C interface byte boundaries at the U-C interface and shall therefore perform the cell delineation bit-by-bit. 

3.5.3 Network timing reference 

If the ATU-C has indicated that it will use indicator bits 20 to 23 to transmit the change of phase offset, the ATU-R
may deliver the 8 kHz signal to the T-R interface.
3.5.4 Framing

Framing of the upstream signal (ATU-R transmitter) closely follows the downstream framing (ATU-C transmitter), 
but with the following exceptions: 

• There are no ASx channels and no AEX byte; 

• A maximum of three channels exist, so that only three (B_F, B_I) pairs are specified;

• The minimum RS FEC coding parameters and interleave depth differ (see Table 23); 

• Four bits of the fast and sync bytes are unused (corresponding to the bit positions used by the ATU-C transmitter to 
specify synchronization control for the ASx channels) (see Table 21 and Table 22). 

• The four indicator bits for NTR transport are not used in upstream direction. 

Two types of framing are defined: full overhead and reduced overhead. Furthermore, two versions of full overhead 
and two versions of reduced overhead are defined. The four resulting framing structures are defined as for the ATU-C and are 
referred to as framing structures 0, 1 , 2 and 3. 

Requirements for framing structures to be supported, depend upon the ATU-R being configured for either STM or 
ATM transport. 

Outside the ASx/LSx serial interfaces, data bytes are transmitted MSB first in accordance with ITU-T
Recommendations G.703, G.707, I.361, and I.432. All serial processing in the ADSL frame (e.g., CRC, scrambling, etc.) shall,
however, be performed LSB first, with the outside world MSB considered by the ADSL as LSB. As a result, the first incoming
bit (outside world MSB) will be the first processed bit inside the ADSL (ADSL LSB).
3.5.4.1 Data symbols

The ATU-R transmitter is functionally similar to the ATU-C transmitter, except that up to three duplex data channels
are synchronized to the 4 kHz ADSL DMT symbol rate (instead of up to four simplex and three duplex channels as is the case
for the ATU-C). The ATU-R transmitter and its associated reference points for data framing are shown in Figure 55 and
Figure 56.

3.5.4.1.1 Superframe structure

The superframe structure of the ATU-R transmitter is identical to that of the ATU-C transmitter, shown in Figure 61.

The ATU-R shall support the indicator bits. The indicator bits, ib20-23, shall not transport NTR in the upstream
direction and shall be set to 1.
3.5.4.1.2 Frame structure (with full overhead)

Each data frame shall be encoded into a DMT symbol. As specified for the ATU-C shown in Figure 61, each frame is
composed of a fast data buffer and an interleaved data buffer, and the frame structure has a different appearance at each of the
reference points (A, B, and C). The bytes of the fast data buffer shall be clocked into the constellation encoder first, followed
by the bytes of the interleaved data buffer. Bytes are clocked least significant bit first.

The assignment of bearer channels to the fast and interleaved buffers shall be configured during initialization with
the exchange of a (B_F, B_I) pair for each data stream, where B_F designates the number of bytes of a given data stream to allocate
to the fast buffer, and B_I designates the number of bytes allocated to the interleaved data buffer.

The three possible (B_F, B_I) pairs are B_F(LSx), B_I(LSx) for X = 0, 1 and 2, for the duplex channels; they are specified
as for the ATU-C.
3.5.4.1.2.1 Fast data buffer

The frame structure of the fast data buffer is the same as that specified for the ATU-C with the following exceptions:

• ASx bytes do not appear;

• The AEX byte does not appear.

The following shall hold for the parameters shown in Figure 80:

C_F(LS0) = 0            if B_F(LS0) = 255 (11111111_2)
         = B_F(LS0)     otherwise

L_F = 0                 if B_F(LS0) = B_F(LS1) = B_F(LS2) = 0
    = 1                 otherwise

K_F = 1 + C_F(LS0) + B_F(LS1) + B_F(LS2) + L_F

N_F = K_F + R_F

where R_F = number of upstream FEC redundancy bytes in fast path.

At reference point A (the mux data frame) in Figure 55 and Figure 56 the fast buffer always contains at least the fast
byte. This is followed by B_F(LS0) bytes of channel LS0, then B_F(LS1) bytes of channel LS1, and B_F(LS2) bytes of channel
LS2, and if any B_F(LSx) is non-zero, a LEX byte.

When B_F(LS0) = 255 (11111111_2), no separate bytes are included for the LS0 channel. Instead, the 16 kbit/s C
channel shall be transported in every other LEX byte on average, using the synchronization byte to denote when to add the LEX
byte to the LS0 bearer channel.

R_F FEC redundancy bytes shall be added to the mux data frame (reference point A) to produce the FEC output data
frame (reference point B), where R_F is given in the C-RATES1 signal options received from the ATU-C during initialization.
Because the data from the fast data buffer is not interleaved, the constellation encoder input data frame (reference point C) is
identical to the FEC output data frame (reference point B).
3.5.4.1.2.2 Interleaved data buffer

The frame structure of the interleaved data buffer is shown in Figure 81 for the three reference points that are defined
in Figure 55 and Figure 56. This structure is the same as that specified for the ATU-C, with the following exceptions:

• ASx bytes do not appear;

• the AEX byte does not appear.

The following shall hold for the parameters shown in Figure 81:

C_I(LS0) = 0            if B_I(LS0) = 255 (11111111_2)
         = B_I(LS0)     otherwise

L_I = 0                 if B_I(LS0) = B_I(LS1) = B_I(LS2) = 0
    = 1                 otherwise

K_I = 1 + C_I(LS0) + B_I(LS1) + B_I(LS2) + L_I

N_I = (S × K_I + R_I) / S

where R_I = number of upstream FEC redundancy bytes in interleaved path and S = number of mux data frames per FEC
codeword.
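
The frame-size relations for the fast and interleaved buffers share the same form and can be sketched with one helper. The function names and the example byte allocations are hypothetical; the interleaved result is returned as a plain quotient, without assuming that R_I is a multiple of S.

```python
def mux_frame_bytes(b_ls0, b_ls1, b_ls2):
    """K = 1 + C(LS0) + B(LS1) + B(LS2) + L for either buffer."""
    c_ls0 = 0 if b_ls0 == 255 else b_ls0        # 255 means LS0 rides in the LEX byte
    lex = 0 if b_ls0 == b_ls1 == b_ls2 == 0 else 1
    return 1 + c_ls0 + b_ls1 + b_ls2 + lex


def fast_frame_bytes(allocation, r_f):
    """N_F = K_F + R_F."""
    return mux_frame_bytes(*allocation) + r_f


def interleaved_frame_bytes(allocation, r_i, s):
    """N_I = (S * K_I + R_I) / S."""
    return (s * mux_frame_bytes(*allocation) + r_i) / s


k_empty = mux_frame_bytes(0, 0, 0)        # only the fast or sync byte
k_c_channel = mux_frame_bytes(255, 0, 0)  # overhead byte plus LEX byte
```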

3.5.4.1.3 Cyclic redundancy check (CRC)

Two cyclic redundancy checks (CRCs), one for the fast data buffer and one for the interleaved data buffer, are
generated for each superframe and transmitted in the first frame of the following superframe. Eight bits per buffer type (fast or
interleaved) per superframe are allocated to the CRC check bits. These bits are computed from the k message bits using the
equation:

crc(D) = M(D) D^8 modulo G(D),

where

M(D) = m_0 D^{k-1} + m_1 D^{k-2} + ... + m_{k-2} D + m_{k-1}

is the message polynomial,

G(D) = D^8 + D^4 + D^3 + D^2 + 1

is the generating polynomial, and

crc(D) = c_0 D^7 + c_1 D^6 + ... + c_6 D + c_7

is the check polynomial.

The CRC bits are transported in the fast byte (8 bits) of frame 0 in the fast data buffer, and the sync byte (8 bits) of
frame 0 in the interleaved data buffer. The bits covered by the CRC include:

• for the fast data buffer:

■ frame 0: LSx bytes (X = 0, 1, 2), followed by the LEX byte;

■ all other frames: fast byte, followed by LSx bytes (X = 0, 1, 2), and LEX byte.

• for the interleaved data buffer:

■ frame 0: LSx bytes (X = 0, 1, 2), followed by the LEX byte;

■ all other frames: sync byte, followed by LSx bytes (X = 0, 1, 2), and LEX byte.

Each byte shall be clocked into the CRC least significant bit first.
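
The CRC equation can be sketched as a bit-serial division by G(D). The function names are hypothetical, and the sketch returns the remainder as an integer whose most significant bit is c_0 (the coefficient of D^7); how c_0 to c_7 are placed within the fast or sync byte is not restated here and is left open.

```python
POLY_LOW = 0x1D  # D^4 + D^3 + D^2 + 1, the terms of G(D) below D^8


def crc8(bits):
    """Return M(D) * D^8 modulo G(D) for message bits m_0, m_1, ..., m_{k-1}."""
    reg = 0
    for bit in bits:
        feedback = ((reg >> 7) & 1) ^ bit
        reg = (reg << 1) & 0xFF
        if feedback:
            reg ^= POLY_LOW
    return reg


def crc8_bytes(data):
    """Clock each byte into the CRC least significant bit first."""
    return crc8((byte >> k) & 1 for byte in data for k in range(8))


single_bit = crc8([1])  # M(D) = 1, so the remainder is D^8 mod G(D)
```

For a single message bit equal to 1 the remainder is D^4 + D^3 + D^2 + 1, which gives a quick check that the register implements the stated generating polynomial.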

The CRC-generating polynomial and the method of generating the CRC byte are the same as for the downstream
data.
3.5.4.2 Synchronization

If the bit timing base of the input user data streams is not synchronous with the ADSL modem timing base, the input
data streams shall be synchronized to the ADSL timing base using the synchronization control mechanism (consisting of
synchronization control byte and the LEX byte). Forward-error-correction coding shall always be applied to the
synchronization control byte(s).

If the bit timing base of the input user data streams is synchronous with the ADSL modem timing base, then the
synchronization control mechanism is not needed. In that case the synchronization control byte shall always indicate
"no synchronization action".

3.5.4.2.1. Synchronization for the fast data buffer

Synchronization control for the fast data buffer can occur in frames 2 through 33 and 36 through 67 of an ADSL
superframe, where the fast byte may be used as the synchronization control byte. No synchronization action is to be taken for
those frames in which the fast byte is used for CRC, fixed indicator bits, or EOC. The format of the fast byte when used as
synchronization control for the fast data buffer shall be as given in Table 21.

In the case where no signals are allocated to the interleaved data buffer, the sync byte carries the AOC data directly
as shown in Figure 63.



Table 21 - Fast byte format for synchronization

Bit        Application                       Specific usage
---------  --------------------------------  ----------------------------------------------------------
sc7-sc4    not used                          set to "0_2" until specified otherwise
sc3, sc2   LSx channel designator            "00_2" : channel LS0
                                             "01_2" : channel LS1
                                             "10_2" : channel LS2
                                             "11_2" : no synchronization action
sc1        Synchronization control for the   "1_2" : add LEX byte to designated LSx channel
           designated LSx channel            "0_2" : delete last byte from designated LSx channel
sc0        Synchronization/EOC designator    "0_2" : perform synchronization control as indicated in sc7-sc1
                                             "1_2" : this byte is part of an EOC frame



If the bit timing base of the input bearer channels (LSx) is synchronous with the ADSL modem timing base, then
ADSL systems need not perform synchronization control by adding or deleting LEX bytes to/from the designated LSx channels.
The synchronization control byte shall indicate "no synchronization action" (i.e., sc7-sc0 coded "000011X0₂", with X
discretionary).
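
The fast byte fields described above can be illustrated with a short decoding sketch. It is only an illustration under stated assumptions: the function name `decode_fast_sync_byte` and the returned labels are hypothetical, and sc7 is assumed to be the most significant bit of the byte, which the text does not state explicitly.

```python
# Hypothetical helper, not part of the specification text.
# Assumption: sc7 is the most significant bit and sc0 the least significant bit.
CHANNELS = {0b00: "LS0", 0b01: "LS1", 0b10: "LS2"}

def decode_fast_sync_byte(byte):
    """Interpret a fast byte used as synchronization control (fast data buffer)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in 0..255")
    sc0 = byte & 0x01                # synchronization/EOC designator
    sc1 = (byte >> 1) & 0x01         # 1 = add LEX byte, 0 = delete last byte
    designator = (byte >> 2) & 0x03  # sc3, sc2: LSx channel designator
    if sc0 == 1:
        return ("eoc", None)         # byte belongs to an EOC frame
    if designator == 0b11:
        return ("no_action", None)   # "no synchronization action"
    action = "add_lex" if sc1 else "delete_last"
    return (action, CHANNELS[designator])
```

For example, both 0b00001100 and 0b00001110 (the "000011X0₂" pattern) decode to "no synchronization action" in this sketch.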

When the data rate of the C channel is 16 kbit/s, the LS0 bearer channel shall be transported in the LEX byte, using
the "add LEX byte to designated LSx channel", with LS0 as the designated channel, every other frame on average.
3.5.4.2.2. Synchronization for the interleaved data buffer 

Synchronization control for the interleaved data buffer can occur in frames 1 through 67 of an ADSL superframe,
where the sync byte may be used as the synchronization control byte. No synchronization action shall be taken during frame 0,
where the sync byte is used for CRC, and during frames in which the LEX byte carries the AOC.

The format of the sync byte when used as synchronization control for the interleaved data buffer shall be as given in
Table 22. In the case where no signals are allocated to the interleaved data buffer, the sync byte shall carry the AOC data
directly, as shown in Figure 63.

Table 22 - Sync byte format for synchronization

Bit        Application                       Specific usage
sc7-sc4    not used                          set to "0₂" until specified otherwise
sc3, sc2   LSx channel designator            "00₂": channel LS0
                                             "01₂": channel LS1
                                             "10₂": channel LS2
                                             "11₂": no synchronization action
sc1        Synchronization control for the   "1₂": add LEX byte to designated LSx channel
           designated LSx channel            "0₂": delete last byte from designated LSx channel
sc0        Synchronization/AOC designator    "0₂": perform synchronization control as indicated in sc3-sc1
                                             "1₂": LEX byte carries ADSL overhead control channel data; a
                                                   delete synchronization control may be allowed as indicated in
                                                   sc3-sc1



When the data rate of the C channel is 16 kbit/s, the LS0 bearer channel shall be transported in the LEX byte, using
the "add LEX byte to designated LSx channel", with LS0 as the designated channel, every other frame on average.

If the bit timing base of the input bearer channels (LSx) is synchronous with the ADSL modem timing base, then
ADSL systems need not perform synchronization control by adding or deleting LEX bytes to/from the designated LSx channels,
and the synchronization control byte shall indicate "no synchronization action". In this case, and when framing structure 1 is
used, sc7-sc0 shall always be coded "000011XX₂", with X discretionary. When sc0 is set to 1, the LEX byte shall carry AOC.
When sc0 is set to 0, the LEX byte shall be coded 00₁₆. The sc0 bit may be set to 0 only in between transmissions of 5
concatenated and identical AOC messages.
3.5.4.3 Reduced overhead framing 

The format described in 2.5.4.1.2 for full overhead framing includes overhead to allow for the synchronization of
three LSx bearer channels. When the synchronization function described in 2.5.4.2 is not required, the ADSL equipment may
operate in a reduced overhead mode. This mode retains all the full overhead mode functions except synchronization control.
When using the reduced overhead framing, the framing structure shall be as defined in 2.4.4.3.1 (when using separate fast and
sync bytes) or 2.4.4.3.2 (when using merged fast and sync bytes).

3.5.5 Scramblers 

The data streams output from the fast and interleaved buffers shall be scrambled separately using the same algorithm
as for the downstream signal.

3.5.6 Forward error correction 

The upstream data shall be Reed-Solomon coded and interleaved using the same algorithm as for the downstream data.

The ATU-R shall support upstream transmission with at least any combination of the FEC coding capabilities shown
in Table 23.

Table 23 - Minimum FEC coding capabilities for ATU-R

Parameter                       Fast buffer                             Interleaved buffer
Parity bytes per R-S codeword   R_F = 0, 2, 4, 6, 8, 10, 12, 14, 16     R_I = 0, 2, 4, 6, 8, 10, 12, 14, 16
                                (see NOTE 2)                            (see NOTE 1 and NOTE 2)
DMT symbols per R-S codeword    S = 1                                   S = 1, 2, 4, 8, 16
Interleave depth                not applicable                          D = 1, 2, 4, 8

NOTE 1 - R_F can be > 0 only if K_F > 0, and R_I can be > 0 only if K_I > 0.
NOTE 2 - R_I shall be an integer multiple of S.



The ATU-R shall also support upstream transmission with at least any combination of the FEC coding capabilities 
shown in Table 14. 

3.5.7 Tone ordering 

The tone-ordering algorithm shall be the same as for the downstream data. 

3.5.8 Constellation encoder - Trellis version 

Block processing of Wei's 16-state 4-dimensional trellis code to improve system performance is optional. An
algorithmic constellation encoder shall be used to construct constellations with a maximum number of bits equal to N_upmax,
where 8 ≤ N_upmax ≤ 15.

The encoding algorithm shall be the same as that used for downstream data (with the substitution of the constellation
limit of N_upmax for N_downmax).

3.5.9 Constellation encoder - Uncoded version 

An algorithmic constellation encoder shall be used to construct constellations with a maximum number of bits equal
to N_upmax, where 8 ≤ N_upmax ≤ 15. The encoding algorithm is the same as that used for downstream data (with the
substitution of the constellation limit of N_upmax for N_downmax). The constellation encoder shall not use trellis coding with
this option.

3.5.10 Gain scaling 

For the transmission of data symbols, gain scaling, g_i, shall be applied as requested by the ATU-C and possibly
updated during Showtime via the bit swap procedure. Only values of g_i equal to 0 or within a range of approximately 0.19 to
1.33 (i.e., -14.5 dB to +2.5 dB) may be used. For the transmission of synchronization symbols, no gain scaling shall be applied
to any sub-carrier.

Each constellation point (X_i, Y_i), i.e., the complex number X_i + jY_i output from the encoder, is multiplied by g_i:

Z_i = g_i (X_i + jY_i)
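
The gain scaling rule can be illustrated with a minimal sketch. The names `gain_db` and `scale_point` are hypothetical, and the range limits are treated as exact here even though the text gives them only approximately.

```python
import math

# Approximate linear gain limits quoted in the text (about -14.5 dB to +2.5 dB).
G_MIN, G_MAX = 0.19, 1.33

def gain_db(g):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(g)

def scale_point(x, y, g):
    """Return Z_i = g_i * (X_i + j*Y_i); g must be 0 or lie within [G_MIN, G_MAX]."""
    if g != 0 and not (G_MIN <= g <= G_MAX):
        raise ValueError("gain outside the permitted range")
    return g * complex(x, y)
```

Under these assumptions, `gain_db(0.19)` is about -14.4 dB and `gain_db(1.33)` is about +2.5 dB, consistent with the stated limits.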

3.5.11. Modulation

Frequency spacing, Δf, between sub-carriers shall be 4.3125 kHz with a tolerance of +/- 50 ppm.
3.5.11.1. Sub-carriers
3.5.11.1.1. Data sub-carriers

The channel analysis signal allows for a maximum of 31 carriers (at frequencies nΔf) to be used. The range of n
depends on the service option selected. For example, for ADSL above POTS the lower limit is set by the POTS/ADSL splitting
filters; the upper limit is set by the transmit and receive band-limiting filters, and shall be no greater than 31. The cut-off
frequencies of these filters are at the discretion of the manufacturer because the range of usable n is determined during the
channel estimation.
3.5.11.1.2 Nyquist frequency

The sub-carrier at the Nyquist frequency shall not be used for user data and shall be real valued.

3.5.11.1.3 DC 



The sub-carrier at DC (#0) shall not be used, and shall contain no energy.
3.5.11.2. Synchronization symbol

The synchronization symbol permits recovery of the frame boundary after micro-interruptions that might otherwise 
force retraining. 

The data symbol rate, f_symb = 4 kHz, the sub-carrier separation, Δf = 4.3125 kHz, and the IDFT size, N = 64, are
such that a cyclic prefix of 5 samples could be used. That is,
(64 + 5) * 4.0 = 64 * 4.3125 = 276

The cyclic prefix shall, however, be shortened to 4 samples, and a synchronization symbol (with a nominal length of 68
samples) inserted after every 68 data symbols. That is,

(64 + 4) * 69 = (64 + 5) * 68
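
The two identities above can be checked numerically. The following sketch only restates the arithmetic from the text; the variable names are illustrative, and the 17 ms superframe duration is derived here rather than quoted from the text.

```python
N = 64                 # IDFT size
DELTA_F_KHZ = 4.3125   # sub-carrier separation in kHz
F_SYMB_KHZ = 4.0       # data symbol rate in kHz

# Sample rate in kHz, computed two ways: (N + prefix) * symbol rate and N * spacing.
sample_rate_khz = N * DELTA_F_KHZ
assert (N + 5) * F_SYMB_KHZ == sample_rate_khz == 276

# Shortening the prefix to 4 samples frees room for one extra (synchronization)
# symbol after every 68 data symbols: 69 symbols of 68 samples each.
samples_per_superframe = (N + 4) * 69
assert samples_per_superframe == (N + 5) * 68 == 4692

# 4692 samples at 276 kHz last 17 ms, i.e. 68 data symbols every 17 ms (4000 per second).
superframe_ms = samples_per_superframe / sample_rate_khz
```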

The minimum set of sub-carriers to be used is the set used for data transmission (i.e., those for which b_i > 0); sub-carriers for
which b_i = 0 may be used at a reduced PSD. The data modulated onto each sub-carrier shall be as defined above; it shall not
depend on which sub-carriers are used.
3.5.12. Transmitter dynamic range 

The transmitter includes all analog transmitter functions: the D/A converter, the anti-aliasing filter, the hybrid 
circuitry, and the POTS splitter. 
3.5.12.1. Maximum clipping rate

The maximum output signal of the transmitter shall be such that the signal shall be clipped no more than 0.00001% 
of the time. 

3.5.12.2 Noise/Distortion floor

The signal to noise plus distortion ratio of the transmitted signal in a given sub-carrier ((S/N+D)_i) is specified as the
ratio of the rms value of the full-amplitude tone in that sub-carrier to the rms sum of all the non-tone signals in the 4.3125 kHz
frequency band centered on the sub-carrier frequency. This ratio is measured for each sub-carrier used for transmission using a
Multi-Tone Power Ratio (MTPR) test as shown in Figure 77.

Over the transmission frequency band, the MTPR of the transmitter in any sub-carrier shall be no less than
(3N_upi + 20) dB, where N_upi is defined as the size of the constellation (in bits) to be used on sub-carrier i. The transmitter
MTPR shall be at least +38 dB (corresponding to an N_upi of 6) for any sub-carrier.

Signals transmitted during normal initialization and data transmission cannot be used for this test because the DMT
symbols have a cyclic prefix appended, and the PSD of a non-repetitive signal does not have nulls at any sub-carrier
frequencies. A gated FFT-based analyzer could be used, but this would measure both the non-linear distortion and the linear
distortion introduced by the transmit filter. Therefore this test will require that the transmitter be programmed with special
software, probably to be used during development only.

4.- Use of SMCCC and PMCCC for wired communications such as xDSL.

We now describe the use of SMCCC and PMCCC for xDSL systems and, in particular, apply it to the case of ADSL
transceivers. The use for other xDSL systems and other data communications systems is straightforward. It should be noted
that the appended claims are not intended to be limited to the particular application of ADSL.

An SMCCC is formed by two (or more) constituent systematic encoders joined through an interleaver. The input
information bits feed the first encoder and, after having been scrambled by the interleaver, enter the second encoder. A code
word of a serial concatenated code comprises the input bits to the first encoder followed by the parity check bits of both
encoders. SMCCC achieves near-Shannon-limit error correction performance. Here we describe the proposed encoder, the
decoder and some simulation results.

4.1- Parallel Multiple Convolutional Concatenated Codes. 

A PMCCC encoder is formed by two (or more) constituent systematic encoders joined through one or more
interleavers. The input information bits feed the first encoder and, after having been scrambled by the interleaver, enter the
second encoder. A code word of a parallel concatenated code comprises the input bits to the first encoder followed by the
parity check bits of both encoders. Here we present the proposed encoder, the decoder and some simulation results. The
disadvantage of the PMCCC is that it has a floor error around a BER of 10^-6. This could be improved with a good interleaver
design, but at the cost of a large number of iterations.

4.2.1- Parallel Multiple Convolutional Concatenated Codes Encoder.

A PMCCC encoder comprises two parallel concatenated recursive systematic convolutional encoders separated by
an interleaver. The encoders are arranged in a "parallel concatenation". In a preferred embodiment, the concatenated recursive
systematic convolutional encoders may be identical.

Figure 82 represents the proposed encoder. The input is a block of information bits. The two encoders generate
parity symbols (u_0 and u'_0) from two simple recursive convolutional codes. The key innovation of this technique is an
interleaver "π", which permutes the original information bits before input to the second encoder. The permutation performed
by the interleaver allows those input sequences for which one encoder produces low-weight codewords to usually cause the
other encoder to produce high-weight codewords. Thus, even though the constituent codes are individually weak, the
combination is surprisingly powerful. The resulting code has features similar to a "random" block code.

In this way, we have the information symbols (u_1 and u_2) and two redundant symbols (u_0 and u'_0). With this
redundancy it is possible to reach longer loops and to reduce the PAR, at the cost of a slight increase of the constellation
encoder.
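
The parallel concatenation can be sketched in a few lines. This is a minimal illustration, not the encoder of Figure 82: it assumes a single input bit stream, a 4-state constituent code with octal generators (1, 5/7) chosen purely for brevity, and hypothetical function names `rsc_parity` and `pmccc_encode`.

```python
def rsc_parity(bits, feedback=(1, 1), forward=(0, 1)):
    """Parity stream of a recursive systematic convolutional encoder.

    feedback/forward are taps on the shift register (delays 1, 2, ...).
    The defaults correspond to the assumed generators (1, 5/7) in octal.
    """
    state = [0] * len(feedback)
    parity = []
    for b in bits:
        a = b
        for tap, s in zip(feedback, state):   # recursive (feedback) part
            a ^= tap & s
        p = a
        for tap, s in zip(forward, state):    # feedforward part
            p ^= tap & s
        parity.append(p)
        state = [a] + state[:-1]              # shift the register
    return parity

def pmccc_encode(bits, perm):
    """Return (systematic bits, parity of encoder 1, parity of encoder 2).

    perm is the interleaver: the second encoder sees bits[perm[0]], bits[perm[1]], ...
    """
    interleaved = [bits[i] for i in perm]
    return list(bits), rsc_parity(bits), rsc_parity(interleaved)
```

With a single 1 in the input, each recursive encoder produces a parity sequence that does not die out, which is why the interleaver's placement of the input bits matters for the overall codeword weight.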

In Figure 83 we present the conversion that we propose, taking into account the new parity bit.

4.2.2- Parallel Multiple Convolutional Concatenated Codes Decoder.

In Figure 84 we present the decoder, which uses an iterative technique with two soft-decision input/output trellis
decoders in each decoding stage. The Maximum-a-Posteriori (MAP) Trellis decoder provides the soft output result suitable for
PMCCC decoding.

The first decoder should deliver a soft output to the second decoder. The logarithm of the Likelihood Ratio (LLR) of 
a bit decision is the soft decision information output by the MAP decoder. 

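Let $u_k$ be the binary random variable taking values in $\{0, 1\}$, representing the sequence of information bits $u = (u_1, \ldots, u_N)$. The
optimum decision algorithm on the $k$th bit $u_k$ is based on the conditional log-likelihood ratio $L_k$:

$$L_k = \log \frac{P(u_k = 1 \mid y)}{P(u_k = 0 \mid y)} = \log \frac{\sum_{u:u_k=1} P(y \mid u) \prod_{j \ne k} P(u_j)}{\sum_{u:u_k=0} P(y \mid u) \prod_{j \ne k} P(u_j)} + \log \frac{P(u_k = 1)}{P(u_k = 0)}$$

where $P(u_j)$ are the a priori probabilities.

Using Bayes' rule and the following approximation:

$$P(u \mid y_i) \approx \prod_{k=1}^{N} \tilde{P}_i(u_k) \tag{108}$$

the MAP algorithm approximates a nonseparable distribution with a separable one. It is possible to separate $P(u \mid y_i)$ using

$$\tilde{P}_i(u_k) = \frac{e^{u_k \tilde{L}_{ik}}}{1 + e^{\tilde{L}_{ik}}} \tag{109}$$

$$L_k = f(y_1, \tilde{L}_0, \tilde{L}_2, k) + \tilde{L}_{0k} + \tilde{L}_{2k} \tag{110}$$

For binary modulation:

$$\tilde{L}_{0k} = \frac{2 A y_{0k}}{\sigma^2} \tag{111}$$

$$f(y_1, \tilde{L}_0, \tilde{L}_2, k) = \log \frac{\sum_{u:u_k=1} P(y_1 \mid u) \prod_{j \ne k} e^{u_j (\tilde{L}_{0j} + \tilde{L}_{2j})}}{\sum_{u:u_k=0} P(y_1 \mid u) \prod_{j \ne k} e^{u_j (\tilde{L}_{0j} + \tilde{L}_{2j})}} \tag{112}$$

and similarly:

$$L_k = f(y_2, \tilde{L}_0, \tilde{L}_1, k) + \tilde{L}_{0k} + \tilde{L}_{1k} \tag{113}$$

$$f(y_2, \tilde{L}_0, \tilde{L}_1, k) = \log \frac{\sum_{u:u_k=1} P(y_2 \mid u) \prod_{j \ne k} e^{u_j (\tilde{L}_{0j} + \tilde{L}_{1j})}}{\sum_{u:u_k=0} P(y_2 \mid u) \prod_{j \ne k} e^{u_j (\tilde{L}_{0j} + \tilde{L}_{1j})}} \tag{114}$$

A solution to these equations is:

$$\tilde{L}_{1k} = f(y_1, \tilde{L}_0, \tilde{L}_2, k) \tag{115}$$

$$\tilde{L}_{2k} = f(y_2, \tilde{L}_0, \tilde{L}_1, k) \tag{116}$$

for $k = 1, 2, \ldots, N$. The final decision is based on:

$$L_k = \tilde{L}_{0k} + \tilde{L}_{1k} + \tilde{L}_{2k} \tag{117}$$

which is passed through a hard limiter with zero threshold.

The nonlinear equations can be solved using the iterative procedure:

$$\tilde{L}_{1k}^{(m+1)} = \alpha_1^{(m)} f(y_1, \tilde{L}_0, \tilde{L}_2^{(m)}, k) \tag{118}$$

$$\tilde{L}_{2k}^{(m+1)} = \alpha_2^{(m)} f(y_2, \tilde{L}_0, \tilde{L}_1^{(m)}, k) \tag{119}$$

The recursion can be started with the initial condition:

$$\tilde{L}_1^{(0)} = \tilde{L}_2^{(0)} = \tilde{L}_0 \tag{120}$$

For each iteration, $\alpha_1^{(m)}$ and $\alpha_2^{(m)}$ can be optimized or set to 1 for simplicity.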
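
Two of the simpler steps in this derivation, the channel log-likelihood ratio for binary modulation and the final hard-limited decision, can be sketched as follows. The function names are hypothetical; the sketch assumes that A is the received signal amplitude, that the noise variance is σ², and that the extrinsic values from the two constituent decoders are already available (the MAP computation of f itself is not shown).

```python
def channel_llr(y0, amplitude, sigma2):
    """Channel LLRs for binary modulation: 2 * A * y / sigma^2 per received sample."""
    return [2.0 * amplitude * y / sigma2 for y in y0]

def hard_decision(l0, l1, l2):
    """Sum the channel LLR and the two extrinsic LLRs, then hard-limit at zero.

    A total of exactly zero is mapped to 0 here; that tie-break is an arbitrary choice.
    """
    return [1 if (a + b + c) > 0 else 0 for a, b, c in zip(l0, l1, l2)]
```

For example, with A = 1 and σ² = 0.5, received samples 0.5 and -1.0 give channel LLRs of 2.0 and -4.0.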
4.2.3. Design of the interleaver for PMCCC. 

In a PMCCC the interleaver establishes a relationship between portions of a codeword. It is generally assumed that
when a PMCCC decoder is operating at low bit error rates, error sequences have small Hamming weights. From this, and
from the properties of PMCCC, it is possible to develop a mathematical structure for interleaver design, permitting the
identification of quantitatively optimal interleavers. Simulations show that the mathematics captures some, but not all, of the
essential characteristics of a successful interleaver. Modifying a random interleaver according to some mathematical ideas
gives excellent simulation results.

The function of the interleaver in the PMCCC is to ensure that at least one of the codeword components has high
Hamming weight. For a better PMCCC, we can design an interleaver of permutation length $p$ that maximizes the minimum
Hamming weight generated by weight-two inputs. This requires maximizing:

$$s_\pi = \min_{1 \le i, j \le p,\ i \ne j} \left( |j - i| + |\pi(j) - \pi(i)| \right) \tag{121}$$

where $\pi$ is the interleaver function. It is also possible to replace the sum with the maximum:

$$s = \min_{1 \le i, j \le p,\ i \ne j} \max\left( |j - i|,\ |\pi(j) - \pi(i)| \right) \tag{122}$$

An alternate method for interleaver design is to disperse symbols as widely as possible in a "constellation way". One effective
method is to choose s_1 and s_2 and generate π one point at a time. For each i ∈ [1, p], taken sequentially, random values are
considered for π(i) until one is found that satisfies the condition for s = s_1. In Figure 85, it is shown how in curve d the floor
error can be avoided using this method.

The constellation of the interleaver used to obtain curve "d" is presented in Figure 86.
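
The point-by-point random construction can be illustrated with the commonly used S-random procedure, which is the permutation type named in the simulations below. This is a sketch under assumptions: it uses a single spread parameter s rather than the pair s_1, s_2, and the function names, the restart strategy and the default seed are illustrative choices, not taken from the text.

```python
import random

def s_random_interleaver(p, s, seed=0, max_restarts=1000):
    """Build a permutation of range(p) in which any two positions at most s apart
    are mapped to values more than s apart."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        pool = list(range(p))
        rng.shuffle(pool)
        perm = []
        for i in range(p):
            recent = perm[max(0, i - s):i]   # the last s values already placed
            for idx, cand in enumerate(pool):
                if all(abs(cand - r) > s for r in recent):
                    perm.append(pool.pop(idx))
                    break
            else:
                break                        # dead end: restart with a new shuffle
        if len(perm) == p:
            return perm
    raise RuntimeError("no interleaver found; try a smaller s")

def satisfies_spread(perm, s):
    """Check the spreading property used by the construction above."""
    return all(abs(perm[i] - perm[j]) > s
               for i in range(len(perm))
               for j in range(max(0, i - s), i))
```

The construction tends to succeed quickly when s is chosen below roughly the square root of p/2, which is a commonly quoted rule of thumb rather than a guarantee.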
4.2.4- Simulations. 
Simulations with:

(a) two equal, recursive convolutional constituent codes,

(b) with 16 states,

(c) interleavers of length 4096 and 16384,

(d) using S-random permutation with S=31 and S=40,

(e) running each simulation for at least 25 Mbits,

show that the decoding algorithm converges down to BER = 10^-3 at E_b/N_0 below 1 dB with nine iterations.
4.3. Serial Multiple Concatenated Convolutional Codes: 

4.3.1 Encoder 

An SMCCC encoder comprises two serial concatenated recursive systematic convolutional encoders separated by
an interleaver. The encoders are arranged in a "serial concatenation". The concatenated recursive systematic convolutional
encoders are identical.

Figure 87 represents the proposed encoder. An SMCCC encoder is a combination of two simple encoders. The input
is a block of information bits. The two encoders generate parity symbols (u_0 and u'_0) from two simple recursive convolutional
codes. The key innovation of this technique is an interleaver "π", which permutes the original information bits before input to
the second encoder. The permutation allows those input sequences for which one encoder produces low-weight codewords
to usually cause the other encoder to produce high-weight codewords. Thus, even though the constituent codes are
individually weak, the combination is surprisingly powerful. The resulting code has features similar to a "random" block code.

In this way, we have the information symbols (u_1 and u_2) and two redundant symbols (u_0 and u'_0). With this
redundancy it is possible to reach longer loops and to reduce the peak to average ratio (PAR), at the cost of a slight increase of
the constellation encoder.

In Figure 83 we present the conversion that we propose, taking into account the new parity bit.

4.3.2 Decoder 

In Figure 88, the block diagram of an iterative decoder is shown. It is based on two modules denoted by "SISO" one 
for each encoder, an interleaver, and a deinterleaver. The SISO module is a four-port device, with two inputs and two outputs. 
It accepts as inputs the probability distributions of the information and code symbols labeling the edges of the code trellis, and 
forms as outputs an update of these distributions based upon the code constraints. The updated probabilities of the input and 
code symbols are used in the decoding procedure. 

The SISO module is a four-port device that accepts at the input the sequences of probability distributions and outputs 
the sequences of probability distributions based on its inputs and on its knowledge of the code. The output probability 
distributions represent a smoothed version of the input distributions. The algorithm is completely general and capable of 
coping with parallel edges and also with encoders with rates greater than one, like those encountered in some concatenated 
schemes. 

The SISO algorithm requires that the whole sequence has been received before starting the smoothing process. The
reason is that backward recursion starts from the final trellis state. A more flexible decoding strategy is offered by modifying
the algorithm in such a way that the SISO module operates on a fixed memory span and outputs the smoothed probability
distributions after a given delay, D. This new algorithm is called the sliding-window soft-input soft-output (SW-SISO)
algorithm.

The SW-SISO algorithm solves the problem of continuously updating the probability distributions, without requiring
trellis terminations. Its computational complexity in some cases is around 5 times that of other suboptimal algorithms like
SOVA. This is due mainly to the fact that it is a multiplicative algorithm. In this section, we overcome this drawback by
proposing the additive version of the SISO algorithm.
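
The text does not spell out the additive form. One common way to turn a multiplicative recursion into an additive one is to work with logarithms and replace sums of probabilities by the max* (Jacobian logarithm) operation. The sketch below shows a single forward trellis step in both forms; it is an assumption about how such a conversion is typically done, with hypothetical function names, and it is not the algorithm of Figure 88.

```python
import math

def max_star(a, b):
    """Compute log(exp(a) + exp(b)) without leaving the log domain."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def forward_step_mult(alpha, gamma):
    """Multiplicative update: new_alpha[t] = sum over s of alpha[s] * gamma[s][t]."""
    return [sum(alpha[s] * gamma[s][t] for s in range(len(alpha)))
            for t in range(len(gamma[0]))]

def forward_step_add(log_alpha, log_gamma):
    """Additive update: products become sums, sums become max* operations."""
    out = []
    for t in range(len(log_gamma[0])):
        acc = log_alpha[0] + log_gamma[0][t]
        for s in range(1, len(log_alpha)):
            acc = max_star(acc, log_alpha[s] + log_gamma[s][t])
        out.append(acc)
    return out
```

Exponentiating the output of `forward_step_add` reproduces the output of `forward_step_mult`, provided all probabilities are strictly positive so that their logarithms exist.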
4.3.3. Interleaver design. 

SMCCC does not have a problem with floor errors as does PMCCC. The floor error begins only below a BER of 10^-7, which
makes SMCCC suitable for ADSL applications. In an SMCCC the interleaver establishes a relationship between portions of a
codeword. For a good SMCCC, we can design an interleaver of permutation length p that maximizes the minimum Hamming
weight generated by weight-two inputs. In the SMCCC case, because one of the inputs comes from the outer encoder, the role
of the interleaver is not so critical; for this reason the method proposed for the interleaver is to disperse symbols as widely as
possible in a "constellation way". One effective method is to choose, for each i ∈ [1, p], π(i) = p/3 * i. An example of this
method is shown in Figure 89.
4.3.4- Simulations.

Simulations with two equal, recursive convolutional constituent codes with 16 states and an interleaver of length
between 100 and 1000 using S-random permutation, with each simulation run examining at least 25 Mbits, show that the
decoding algorithm converges down to BER = 10^-7 at E_b/N_0 below 1 dB with fewer than nine iterations.

4.4 The number of iterations in the decoder.

The number of iterations is a very important subject for the different applications of PMCCC and SMCCC. For
applications where the delay is not important, a large number is acceptable. For real time applications or for quasi-real time
applications it is important to use as few iterations as possible while maintaining the advantages of this technique. The
necessary number of iterations depends upon the E_b/N_0 ratio in the receiver. In Figure 51, we present this relationship for the
SMCCC case. We represent values of E_b/N_0 below 0.1 dB; for values around 2 dB it is sufficient to use fewer than 10
iterations.

4.5. Comparisons

The PMCCC has a floor error around a BER of 10^-6. The reason for this is that the SMCCC functions in an inner and
outer encoder structure, while the PMCCC functions as two parallel encoders. In Figure 52 we present the floor error effect for
PMCCC and show that SMCCC does not exhibit the floor error effect at least down to a BER of 10^-9. Simulating below a BER
of 10^-9 requires a great deal of time, so no simulation result can be given for that region.

5. Reed-Solomon Codes and Turbo Codes for ADSL systems 
5.1 Encoder 

Figure 90 represents the proposed encoder. A PMCCC encoder is a combination of two simple encoders. The input
is a block of information bits. The two encoders generate parity symbols (u_0 and u'_0) from two simple recursive convolutional
codes. The key innovation of this technique is an interleaver which permutes the original information bits before input to
the second encoder. The permutation performed by the interleaver allows those input sequences for which one encoder
produces low-weight codewords to usually cause the other encoder to produce high-weight codewords. Thus, even though the
constituent codes are individually weak, the combination is surprisingly powerful. The resulting code has features similar to a
"random" block code.

In this way, we have the information symbol (u_1) and two redundant symbols (u_0 and u'_0). With this redundancy it is
possible to reach longer loops, or to work at higher bit rates in the same loop, at the cost of a slight increase of the constellation
encoder.
In a preferred embodiment, we suggest the use of two rate-1/2 convolutional encoders in parallel because this
simplifies the Turbo-code encoder while still producing results very close to the channel capacity. The use of two rate-2/3
convolutional encoders in parallel would produce slightly better results (on the order of 0.1 or 0.2 dB) but would increase the
complexity by at least a factor of four. From a practical point of view we think that this is not necessary.

In Figure 83, we present the conversion that we propose, taking into account the new parity bit.
5.2 Decoder. 

In Figure 84, we present the decoder, which uses an iterative technique with two soft-decision input/output trellis
decoders in each decoding stage. The Maximum-a-Posteriori (MAP) Trellis decoder provides the soft output result suitable for
turbo-code decoding.
5.3. Simulations. 

In Figure 91, we present the simulation results that we obtained with two equal, recursive convolutional constituent
codes with an interleaver of length 400 and N=50. With this value the delay is below 5 msec for 1.5 Mbit/s assuming a delay of
3 interleavers.

The number of iterations is always below 10.

The convolutional encoder used is presented in Figure 92. 
6. Forward Error Correction with Low-density parity-check codes

Low-density parity-check codes are codes specified by a matrix containing mostly 0's and only a small number of 1's.
In particular, an (n, j, k) low-density code is a code of block length n with a matrix like that of Table 24 where each column
contains a small fixed number, j, of 1's and each row contains a small fixed number, k, of 1's. Note that this type of matrix
does not have the check digits appearing in diagonal form as in Table 25. However, for coding purposes, the equations
represented by these matrices can always be solved to give the check digits as explicit sums of information digits.

Table 24. Example of a low-density code matrix: n = 20, j = 3, k = 4
Table 25. Example of parity-check matrix.
These codes are not optimum in the somewhat artificial sense of minimizing probability of decoding error for a given
block length, and it can be shown that the maximum rate at which these codes can be used is bounded below channel capacity.
However, a very simple decoding scheme exists for low-density codes, and this compensates for their lack of optimality.

The analysis of a low-density code of long block length is difficult because of the immense number of code words 
involved. It is simpler to analyze a whole ensemble of such codes because the statistics of an ensemble permit one to average 
over quantities that are not tractable in individual codes. From the ensemble behavior, one can make statistical statements 
about the properties of the member codes. Furthermore, one can with high probability find a code with these properties by
random selection from the ensemble. 

In order to define an ensemble of (n, j, k) low-density codes, consider Table 24 again. Note that the matrix is divided
into j sub-matrices, each containing a single 1 in each column. The first of these sub-matrices contains all its 1's in descending
order; i.e., the i-th row contains 1's in columns (i - 1)k + 1 to ik. The other sub-matrices are merely column permutations of the
first. We define an ensemble of (n, j, k) codes as the ensemble resulting from random permutation of the columns of each of the
bottom j - 1 sub-matrices of a matrix such as Table 24, with equal probability assigned to each permutation. There are two
interesting results that can be proven using this ensemble, the first concerning the minimum distance of the member codes, and
the second concerning the probability of decoding error.
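
The ensemble construction just described can be sketched directly. The function name `low_density_matrix` and the seed parameter are illustrative; the sketch draws one member of the ensemble by applying an independent random column permutation to each of the bottom j - 1 sub-matrices.

```python
import random

def low_density_matrix(n, j, k, seed=0):
    """Draw one (n, j, k) low-density parity-check matrix as a list of rows."""
    if n % k != 0:
        raise ValueError("n must be a multiple of k")
    rng = random.Random(seed)
    rows_per_block = n // k
    # First sub-matrix: row i (0-based) has 1's in columns i*k .. (i+1)*k - 1.
    first = [[1 if i * k <= c < (i + 1) * k else 0 for c in range(n)]
             for i in range(rows_per_block)]
    matrix = [row[:] for row in first]
    # Remaining j - 1 sub-matrices: random column permutations of the first.
    for _ in range(j - 1):
        perm = list(range(n))
        rng.shuffle(perm)
        matrix.extend([[row[perm[c]] for c in range(n)] for row in first])
    return matrix
```

For n = 20, j = 3, k = 4 this yields a 15 x 20 matrix with exactly four 1's in every row and three 1's in every column, matching the parameters given for Table 24.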

The minimum distance of a code is the number of positions in which the two nearest code words differ. Over the
ensemble, the minimum distance of a member code is a random variable, and it can be shown that the distribution function of
this random variable can be overbounded by a function. As the block length increases, for fixed j ≥ 3 and k > j, this function
approaches a unit step at a fixed fraction δ_jk of the block length. Thus, for large n, practically all the codes in the ensemble have
a minimum distance of at least nδ_jk. In Table 26 this ratio of typical minimum distance to block length is compared to that for a
parity-check code chosen at random, i.e., with a matrix filled in with equiprobable independent binary digits. It should be noted
that for all the specific nonrandom procedures known for constructing codes, the ratio of minimum distance to block length
appears to approach 0 with increasing block length.

The probability of error using maximum likelihood decoding for low-density codes clearly depends upon the 
particular channel on which the code is being used. The results are particularly simple for the case of the BSC, or binary 
symmetric channel, which is a binary-input, binary-output, memoryless channel with a fixed probability of transition from 
either input to the opposite output. Here it can be shown that over a reasonable range of channel transition probabilities, the 
low-density code has a probability of decoding error that decreases exponentially with block length and that the exponent is the 
same as that for the optimum code of slightly higher rate as given in Table 27.

Table 26. Comparison of δ_jk, the ratio of typical minimum distance to block length for an (n, j, k) code, to δ, the
same ratio for an ordinary parity-check code of the same rate.

j    k    Rate     δ_jk     δ
5    6    0.167    0.255    0.263
4    5    0.2      0.210    0.241
3    4    0.25     0.122    0.214
4    6    0.333    0.120    0.173
3    5    0.4      0.044    0.145
3    6    0.5      0.023    0.11
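
The Rate column of Table 26 is consistent with the design rate 1 - j/k of an (n, j, k) code. As a cross-check, the sketch below also evaluates the standard Gilbert-Varshamov relation H(δ) = 1 - R for a randomly chosen parity-check code; treating the δ column as following that relation is an assumption made here, not a statement from the text, and the function names are illustrative.

```python
import math

def design_rate(j, k):
    """Design rate of an (n, j, k) low-density code: 1 - j/k."""
    return 1.0 - j / k

def binary_entropy(x):
    """Binary entropy function in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def gv_delta(rate):
    """Solve H(delta) = 1 - rate for delta in (0, 0.5) by bisection."""
    lo, hi = 0.0, 0.5
    target = 1.0 - rate
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if binary_entropy(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Under that assumption, a rate of 0.5 gives δ of about 0.110 and a rate of 0.25 gives δ of about 0.214.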



Table 27. Loss of rate associated with low-density codes.

j    k    Rate     Rate for equivalent optimum code
3    6    0.5      0.565
3    5    0.4      0.43
4    6    0.333    0.343
3    4    0.25     0.266



Although this result for the BSC shows how closely low-density codes approach the optimum, the codes are not 
designed primarily for use on this channel. The BSC is an approximation to physical channels only when there is a receiver that 
makes decisions on the incoming signal on a bit-to-bit basis. Since the decoding procedure to be described later can actually 
use the channel a posteriori probabilities, and since a bit-by-bit decision throws away available information, we are actually 
interested in the probability of decoding error of a binary-input, continuous-output channel. If the noise affects the input 
symbols symmetrically, then this probability can again be bounded by an exponentially decreasing function of the block length, 
but the exponent is a rather complicated function of the channel and code. It is expected that the same type of result holds for a 
wide class of channels with memory, but no analytical results have yet been derived. For channels with memory, it is clearly 
advisable, however, to modify the ensemble somewhat, particularly by permuting the first sub-matrix and possibly by changing 
the probability measure on the permutations. 
7. Application to Modem Communications Systems 

In a preferred embodiment, the use of PMCCC or SMCCC is negotiated independently in each direction of 
communication in the system. In the case of ADSL, this permits the use of a trellis code in one direction and an SMCCC code in
the other direction. 
Computer Program Listing 

This patent application includes a computer program listing containing 37 pages, and included as an appendix. The 
program relates to a Reed-Solomon Encoder and Decoder and a PMCCC Encoder and Decoder for 2 parallel concatenated 
convolutional codes. 

Thus it is seen that the objects, features and advantages of the present invention are efficiently obtained. The 
preferred embodiment described herein is intended to disclose the best mode of the invention and to teach those having 
ordinary skill in the art how to make and use the invention, but should not be interpreted as limiting the scope and spirit of the 
invention as embodied in the appended claims. 
What We Claim Is: 

1. A method of forward error correction for communication systems, comprising: 

producing a symbol stream by forward error coding of a data stream using multiple concatenated coders connected 
by an interleaver, all of the concatenated coders being of a single type that is one of convolutional and non-convolutional; 
modulating said symbol stream to produce a modulated signal; and, 

transmitting said modulated signal over a communication link. 

2. The method recited in Claim 1 wherein said communication system is a wired system. 

3. The method recited in Claim 1 wherein said communication system is an optical system. 

4. The method recited in Claim 1 wherein the modulating is accomplished with a multicarrier method. 
5. The method recited in Claim 4 wherein the multicarrier method is a Discrete Multi-Tone (DMT) method. 

6. The method recited in Claim 1 wherein the modulating is accomplished with a CAP-QAM single carrier method. 



7. The method recited in Claim 1 wherein the modulating is accomplished with a Quadrature Amplitude 
Modulation (QAM) method. 

8. The method recited in Claim 1 wherein the modulating is accomplished using a Pulse Amplitude 
Modulation (PAM) method. 

9. The method recited in Claim 1 wherein said multiple concatenated coders are convolutional coders. 

10. The method recited in Claim 1 wherein a first of the concatenated coders is configured in parallel with 
the interleaver and a second of the concatenated coders. 

11. The method recited in Claim 1 wherein a first of the concatenated coders is configured in series with the 
interleaver and a second of the concatenated coders. 

12. The method recited in Claim 1 wherein said multiple concatenated coders are non-convolutional coders. 

13. The method recited in Claim 12 wherein one of said non-convolutional coders comprises a Reed-Solomon 
encoder. 

14. The method recited in Claim 12 wherein one of said non-convolutional coders comprises a low density 
parity check encoder. 

16. A method of forward error correction for communication systems, comprising: 

producing a symbol stream by forward error coding of a data stream using an outer level non-convolutional coder 
and an inner level coder comprising multiple concatenated coders connected by an interleaver, all of the concatenated coders 
being of a single type that is one of convolutional and non-convolutional; 

modulating said symbol stream to produce a modulated signal; and, 
transmitting said modulated signal over a communication link. 

17. The method recited in Claim 16 wherein said outer level coder comprises a Reed-Solomon encoder. 

18. A method of peak power level reduction for communication systems utilizing a plurality of coders 
comprising the following steps: 

producing a peak reduced signal by encoding said data stream by said plurality of coders; 

modulating said peak reduced signal; and, 

transmitting the modulated peak reduced signal. 
19. A method of forward error correction for communication systems, comprising the following steps: 

producing a symbol stream by forward error coding of a data stream using multiple concatenated coders connected 
by an interleaver, all of the concatenated coders being of a single type that is one of convolutional and non-convolutional; 

modulating said symbol stream to produce a modulated signal; 
transmitting said modulated signal over a communication link; 

receiving said modulated signal, where said received modulated signal includes errors; 
demodulating said received signal which includes errors; 

iteratively decoding said demodulated signal using multiple concatenated decoders connected by an interleaver 
and a de-interleaver, all of the concatenated decoders being of the same type as the concatenated coders; and, 
regenerating said data stream and eliminating said errors. 

20. A method of forward error correction for communication systems, comprising: 

receiving a modulated signal from a communications link, where said received modulated signal includes errors; 
demodulating said received signal which includes errors; 
iteratively decoding said demodulated signal using multiple concatenated decoders connected by an interleaver 
and a de-interleaver, all of the concatenated decoders being of a single type that is one of convolutional and 
non-convolutional; and, 

regenerating said data stream and eliminating said errors. 

21. An apparatus for forward error correction for communication systems, comprising: 
multiple concatenated coders connected by an interleaver for producing a symbol stream by forward error coding 
of a data stream, all of the concatenated coders being of a single type that is one of convolutional and non-convolutional; 

a modulator for modulating said symbol stream to produce a modulated signal; and, 

a transmitter for transmitting said modulated signal over a communication link. 

22. The apparatus recited in Claim 21 wherein said communication system is a wired system. 
23. The apparatus recited in Claim 21 wherein said communication system is an optical system. 

24. The apparatus recited in Claim 21 wherein said multiple concatenated coders are convolutional coders. 

25. The apparatus recited in Claim 21 wherein a first of the concatenated coders is configured in parallel 
with the interleaver and a second of the concatenated coders. 

26. The apparatus recited in Claim 21 wherein a first of the concatenated coders is configured in series with 
the interleaver and a second of the concatenated coders. 

27. The apparatus recited in Claim 21 wherein said multiple concatenated coders are non-convolutional coders. 

29. An apparatus for forward error correction for communication systems, comprising: 

an outer level non-convolutional coder and an inner level coder comprising multiple concatenated coders 
connected by an interleaver for producing a symbol stream by forward error coding of a data stream, all of the concatenated 
coders being of a single type that is one of convolutional and non-convolutional; 

a modulator for modulating said symbol stream to produce a modulated signal; and, 
a transmitter for transmitting said modulated signal over a communication link. 

30. The apparatus recited in Claim 29 wherein said outer level coder comprises a Reed-Solomon encoder. 

31. An apparatus for accomplishing peak power level reduction for communication systems utilizing a 
plurality of coders comprising: 

means for producing a peak reduced signal by encoding said data stream by said plurality of coders; 
means for modulating said peak reduced signal; and, 
means for transmitting the modulated peak reduced signal. 



32. An apparatus for forward error correction for communication systems, comprising: 

a receiver for receiving a modulated signal from a communications link, where said received modulated signal 
includes errors; 
a demodulator for demodulating said received signal which includes errors; 

multiple concatenated decoders connected by an interleaver and a de-interleaver for iteratively decoding said 
demodulated signal, all of the concatenated decoders being of a single type that is one of convolutional and 
non-convolutional; and, 

means for regenerating said data stream and eliminating said errors. 